Moore's Law repealed
Finally, there's the looming problem of the future of silicon process-size shrinkage: it can't go on forever. One obvious limit, as Segars pointed out, is the 0.27-nanometer diameter of the silicon atom itself – as process sizes shrink down to, say, 14nm and below, you're only talking about dozens of atoms per transistor gate.
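To put rough numbers on that, here's a back-of-the-envelope sketch in Python. It naively treats the node name as a gate length – real process nodes don't map that cleanly – and uses the 0.27nm atomic diameter Segars cited:

```python
# Back-of-the-envelope only: treat the node name as a rough gate length
# and ask how many 0.27nm silicon atoms would span it.
SILICON_ATOM_DIAMETER_NM = 0.27  # figure cited by Segars

for node_nm in (28, 20, 14, 10, 7):
    atoms = node_nm / SILICON_ATOM_DIAMETER_NM
    print(f"{node_nm:>2}nm: ~{atoms:.0f} atoms across the gate")
```

At 14nm that works out to roughly 52 atoms – dozens, just as he says.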
But there are plenty of other challenges to be met before you can count the number of silicon atoms in a gate on your fingers and toes – namely, what lithography techniques can take you well below 20nm?
At the 20nm level, Segars said, "the problem is that you need to introduce double patterning." Using two sets of masks to accomplish what one set could do at larger process sizes not only increases mask costs, but also slows manufacturing throughput.
If you want to keep throughput at the same rate, you have to buy more equipment – which might make the ASMLs of the world happier, but driving up chip costs is not a good thing in a world that includes developing nations whose citizens are hankering to join the mobile world.
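The trade-off is easy to sketch. With purely illustrative numbers – none of these figures come from Segars – halving per-tool throughput means doubling the lithography kit just to stand still:

```python
# Illustrative numbers only: double patterning sends each critical layer
# through the litho tool twice, roughly halving per-tool throughput.
single_pass_wph = 200                 # wafers/hour, single patterning (assumed)
passes = 2                            # double patterning
effective_wph = single_pass_wph / passes

target_wph = 200                      # output the fab wants to maintain
tools_needed = target_wph / effective_wph
print(f"Per-tool throughput drops to {effective_wph:.0f} wafers/hour")
print(f"Tools needed to hold {target_wph} wafers/hour: {tools_needed:.0f}x as many")
```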
There's one long-sought technology about which Segars remains a bit sceptical. "At 14 [nanometers] and below, what you really want is EUV," he said, referring to extreme ultraviolet lithography, which has long been seen as a possible solution to the process-shrinking problem. EUV's promise comes from the fact that it's based on 13.5nm wavelength light – one hell of a lot more precise than the 193nm light that Segars said is used in today's deep-ultraviolet lithography.
"The problem is," he said, "that [EUV] is really, really hard to make. You've got to make a plasma out of tin atoms, and then shoot it with a laser, and some light comes out – but the light's really weak, and it gets absorbed by everything. So generating enough of it to economically build chips is very, very hard."
And EUV technology is currently slow. "To have a fab running economically," Segars said, "you need to build about two to three hundred wafers an hour. EUV machines today can do about five."
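Taking those figures at face value, the gap is stark:

```python
# Quick arithmetic on Segars' figures, nothing more.
fab_low, fab_high = 200, 300   # wafers/hour an economic fab needs
euv_wph = 5                    # wafers/hour per EUV machine, per Segars

print(f"EUV machines to match one fab line: {fab_low // euv_wph} to {fab_high // euv_wph}")
```

In other words, you'd need on the order of 40 to 60 of today's EUV machines to do the work of a single conventional lithography line.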
Some observers question whether EUV will ever be a workable form of chip lithography. "If that's the case, then," Segars said, "frankly we're not quite sure what we're going to do."
From his point of view, it's time for the microprocessor industry – in all its disaggregated chunks – to take a page from an old Apple ad campaign and, as he put it, "Think different".
The past is prologue. "Silicon scaling has been great," Segars reminisced. "We've gotten huge gains in power, performance, and area, but it's going to end somewhere, and that's going to affect how we do design and how we run our businesses, so my advice to you is get ready for that.
"It's coming sooner than a lot of people want to recognize."
After the endgame: core teamwork
But that's not to say that there's a dead-end on the road we're travelling. Segars' vision of the future jibes with the one described by fellow ARMian Jem Davies, the company's vice president of technology, when speaking at AMD's Fusion Summit this June – namely, that heterogeneous computing systems are the Next Big Thing.
Simply put, heterogeneous computing systems distribute a workload to various and sundry specialized compute engines – CPU, GPU, video, encryption, baseband, whatever – so that individual sub-tasks are completed efficiently by dedicated hardware best suited to them.
"I think the future of processing is heterogeneous multiprocessing," Segars said, "... dedicated engines arranged in various clusters with a software layer that can understand the underlying hardware, and make sure that if it's not needed, it's shut off, it's not leaking, to preserve that battery."
There are a host of challenges to achieving the holy heterogeneous grail, of course – not the least of which is keeping all the various cores in close communication, and optimally data-coherent.
To that end, ARM's upcoming Cortex-A15 compute core – which will likely appear in early 2013 – will introduce a cache coherent interconnect that will enable full coherency among multiple CPU clusters. Segars also projects that by 2015, coherency in ARM-based SoCs and systems will no longer be limited to CPUs, but will extend to full "where's that data?" transparency among CPUs, GPUs, and specialized engines.
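For a feel of what that "where's that data?" bookkeeping involves, here's a toy, software-level sketch of directory-style coherency. A real interconnect does this in hardware on cache lines; everything below is invented for illustration:

```python
# Toy directory-based coherency: track which engine holds the freshest
# copy of each address, so readers never see stale data.
directory = {}                 # address -> engine holding the latest copy
caches = {"cpu": {}, "gpu": {}}

def write(engine, addr, value):
    caches[engine][addr] = value
    directory[addr] = engine           # record the new owner

def read(engine, addr):
    owner = directory.get(addr)
    if owner and owner != engine:      # fetch from the owner's cache,
        caches[engine][addr] = caches[owner][addr]  # not a stale copy
    return caches[engine].get(addr)

write("gpu", 0x1000, "result computed on the GPU")
print(read("cpu", 0x1000))             # the CPU sees the GPU's latest data
```

That owner lookup-and-fetch is also exactly where the latency costs discussed below creep in.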
The future of cache coherency, ARM style
Full coherence, however, brings with it its own set of challenges, such as unwanted latency when far-flung cores and engines need to share the same data. ARM, AMD, and Intel are all looking into how different approaches to coherency can help – or hinder – heterogeneity.
A lot has changed in the microprocessor world since the Intel 4004 appeared 40 years ago this November. By and large, the arc of improvement has been relatively straightforward, with improvements in process size, processing power, and miniaturization being fairly regular – achieved through one hell of a lot of work, to be sure, but regular nonetheless.
There's been a lot of talk recently about the "post PC era". From Segars' point of view, however, we may also soon be talking about the "post–Moore's Law" era – a time when computing advances are no longer measured in transistor counts per square millimeter, but rather in how quickly, intelligently, and cooperatively different cores and engines can communicate. ®
35x the energy density of a phone battery
When discussing the woeful pace of battery-technology advances when compared to advances in silicon technology, Segars said: "What we really need is a new battery. If someone can work out how to hook up a chocolate bar into a cell phone, that'd be pretty good, because there's about 35 times the energy density in a bar of Cadbury Dairy Milk, and that might help solve our power problem."
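The 35x figure roughly checks out. Using numbers I'm assuming rather than quoting – milk chocolate at about 530kcal per 100g, and a phone-class lithium-ion battery at about 175Wh/kg:

```python
# Sanity-checking the 35x claim with assumed (not quoted) figures.
KCAL_TO_J = 4184
chocolate_j_per_kg = 530 * 10 * KCAL_TO_J   # ~530 kcal/100g -> ~22.2 MJ/kg
battery_j_per_kg = 175 * 3600               # ~175 Wh/kg     -> ~0.63 MJ/kg

print(f"Chocolate: {chocolate_j_per_kg / 1e6:.1f} MJ/kg")
print(f"Li-ion:    {battery_j_per_kg / 1e6:.2f} MJ/kg")
print(f"Ratio:     ~{chocolate_j_per_kg / battery_j_per_kg:.0f}x")
```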
1x Dairy Milk Bar
1x fishing rod
1x treadmill with dynamo
1x 30-something single woman
Not all that portable, admittedly, but I've got a patent pending on a nationwide network of charging stations :)
What on earth has happened here?
A thoughtful, intelligent, fascinating and well written article from which I learnt rather a lot. Without any jokes, satire or the faintest smell of clickbait in it. Have I logged on to the wrong site?
Is there actually a continuing market for slightly faster kit at higher cost in the current climate? IMHO most kit has been running fast enough for the last couple of years, despite constant efforts to force us to buy more CPU to support the same functionality.
Extreme gamers can link a few GPUs together, data warehousers can add terabytes of SSD, and the rest of us can upgrade to Linux or Windows XP running LibreOffice ;-)
This article suggests it's time for software to catch up with the hardware.
Back to the '70s, then?
Maybe the way to make these devices run faster is to tighten up the code. After all, we've been getting rather a lot of bloat whilst Moore's Law has applied. In the '70s, when processor time cost money, shaving time off your code gave a distinct advantage, and they didn't have cut'n'paste coders in that era.
I'd predict a trimming back of all those functions that don't get used unless it's the 5th Tuesday in February, to make what does get used rather a lot quicker.
Tux - possibly the home of better software.
Dedicated hardware best suited?
Isn't this rather obvious? The microprocessor exemplifies the concept of jack of all trades, master of none. Frankly, the only reason my netbook is capable of showing me anime is that there's enough grunt to decode the video data in real time. But then my PVR, with a very slow ARM processor, can do much the same, as it pushes the difficult stuff to the on-chip DSP.
Likewise the older generation of MP3 players were essentially a Z80 core hooked to a small DSP, all capable of extracting ten hours out of a single AAA cell.
Go back even further: the Psion 3a was practically built upon this concept. A bloody great ASIC held an x86 clone (the V30) and sound, display, interfacing, etc. Things were only powered up as they were actually required. In this way, a handheld device not unlike an original XT in spec could run for ages on a pair of double-As.
As the guy said, batteries are crap. Anybody who uses their smartphone like it's their new best friend will know that daily recharging is the norm, plus a car charger if using sat-nav. So with this in mind, it makes sense to have the main processor "capable" without being stunning, and push off complicated stuff to dedicated hardware better suited to the task, which can be turned off when not needed. Well, at least until we can run our shiny goodness on chocolatey goodness!