Future of computing crystal-balled by top chip boffins
Bad news: It's going to be tough. Good news: You won't be replaced
If you thought that the microprocessor's first 40 years were chock full of brain-boggling developments, just wait for the next 40 – that's the consensus of a quartet of Intel heavyweights, past and present, with whom we recently spoke.
At the 4004's 40th birthday party in a San Francisco watering hole on November 15, The Reg got an interesting earful from the leader of that first commercially available microprocessor's design team, Federico Faggin. At that same soirée, we also buttonholed Shekhar Borkar of Intel Labs, where he directs microprocessor technology research.
Both men expressed confidence leavened with caution when describing the frontiers of microprocessor design and the future of computing.
Coupled with our earlier discussions with Intel microprocessor architect Steve Pawlowski and process technologist Mark Bohr, we came away with the impression that the past 40 years – especially the first 30 – weren't a simple and straightforward matter of process-size scaling as they may have appeared to those of us who were outside of the research labs.
The chip that helped start the computing revolution, in the hand of its lead designer
We also learned that the next 10 years will be tough – and that after that, the crystal ball goes dark.
It ain't been no cakewalk
When we asked Borkar and Faggin about what might be the biggest challenges when developing future microprocessor technologies, Borkar pointed out that process improvements have been challenging since day one. "It's really tough. It has been tough."
Referring to the microprocessor's first 30 years, when process technology focussed on scaling using Robert Dennard's MOSFET scaling guidelines as outlined in his landmark 1974 paper, Borkar said, "It wasn't a breeze. Dennard scaling laid down the recipe – and it was a very good recipe. And people said, 'Let's follow it'. But it wasn't easy."
Times have changed, though – and not for the better. "Now," he said, "that recipe doesn't work."
Borkar then backtracked a bit, clarifying his statement to say that he believes that engineers will continue to advance scaling, but through more-selective choices of techniques. "It's not fair to say that we are at the end of Dennard scaling," he admitted.
"Dennard scaling showed a simple recipe. A well-behaved recipe," he said. "Now what we are doing is we are following that recipe, but we are doing it intelligently. 'Okay, I am going to scale length only', or 'I am going to scale the oxide, and not length'. We're doing it intelligently, as opposed to scaling everything down. So it is not fair to Dennard to say we are not following his recipe."
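Dennard's "well-behaved recipe" can be summarized in a few lines of illustrative Python. The scaling factors below are the textbook constant-field rules from the 1974 paper, not Intel-specific figures, and the example numbers are made up for the sketch:

```python
# Illustrative sketch of Dennard's constant-field scaling recipe:
# shrink every dimension and the supply voltage by the same factor k,
# and power density stays constant. Numbers are for illustration only.

def dennard_scale(length_nm, voltage_v, k):
    """Scale a device by factor k per the classic constant-field rules."""
    return {
        "length_nm": length_nm / k,      # gate length shrinks by k
        "voltage_v": voltage_v / k,      # supply voltage shrinks by k
        "delay_rel": 1 / k,              # switching delay improves by k
        "power_rel": 1 / k**2,           # power per transistor drops by k^2
        "area_rel": 1 / k**2,            # area per transistor drops by k^2
        "power_density_rel": 1.0,        # power/area stays constant
    }

# One classic generation step: k ≈ 1.4 roughly halves the transistor area
print(dennard_scale(length_nm=90, voltage_v=1.2, k=1.4))
```

The point Borkar makes is that the modern departure is in scaling these knobs selectively – length here, oxide there – rather than turning all of them by the same factor.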
And Borkar has plenty of confidence in continued innovation. "The engineers," he said, "they'll find out a way to do it. There have been lots of innovations that have happened in process technologies – even in the 90s, when they were following Dennard scaling all the way through."
But nowadays, Borkar said, the challenges are getting tougher day by day. But again he put his faith in the ingenuity of engineers. "The one thing that gets the engineers going is a challenge. So when you give them this challenge: 'Make that 10 nanometer device work for me', they don't know better – they make it work for you," he said.
"That's what we did for the last 40 years."
Federico Faggin agreed that experimentation, failure, and innovation are what's needed to push the process forward. "That's life in the trenches," he said.
"I agree with Shekhar that engineers will figure out a way," he continued. "They will not figure out a way to use half an atom to do something, so there's going to be a limit on how small...", at which point Borkar interrupted him, joking: "Go and challenge them. That's my policy."
Coming soon to an Intel chip near you...
Borkar, Faggin, and Bohr all agree that it was about a decade ago when it became clear that pure Dennard scaling wasn't going to cut it.
As we pointed out when celebrating the 4004's 40th birthday earlier this month, the first major process-technology innovation was strained silicon, which increased electron mobility while tamping down current leakage. Intel's first strained silicon processor was the "Prescott" Pentium 4 of 2004.
After strained silicon came high-k metal gate technology, which debuted with Intel's 45nm process in 2007. This advance added a better-insulating gate oxide and a metal gate to further reduce leakage and improve performance.
Next up is what Intel calls "3D" or "tri-gate" transistor technology, in which the channel doesn't lie flat, but instead sticks up into the gate, offering a larger channel surface area in a smaller geometry. Tri-gate transistors will be first used in Intel's 22nm "Ivy Bridge" chips, scheduled to ship next year.
But that increased surface is not the best thing about tri-gate, Bohr told us. "There is a benefit from an increased channel area, but it's not the big one," he said. "The big benefit is that as you are forming the transistor on a narrow, vertical silicon fin, or pillar, the electrostatics are improved – the gate electrode has better control of the channel area. The result is a device that's called 'fully depleted'."
If you're having trouble understanding why that's important, the answer is really rather straightforward: the big advantage of a fully depleted transistor design is that it has a steeper sub-threshold slope. That means, essentially, that the transistor's on-current versus off-current characteristics are improved: it can have a higher on-current when it's switched on, improving performance, and a lower off-current when it's switched off, reducing power leakage.
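To put rough numbers on that, here's a minimal sketch. The slope values are illustrative assumptions, not measured Intel figures – though the ~60mV-per-decade floor is the standard room-temperature limit for a conventional MOSFET:

```python
# Rough sketch: in the subthreshold region, drain current falls about
# one decade for every "SS" millivolts of gate voltage below threshold.
# A steeper (smaller) SS means far less off-state leakage.
# Values are illustrative; ~60 mV/decade is the room-temperature limit.

def off_current_ratio(ss_mv_per_decade, vth_mv=300):
    """Factor by which current drops from Vth down to Vgs = 0."""
    return 10 ** (vth_mv / ss_mv_per_decade)

planar = off_current_ratio(ss_mv_per_decade=100)         # leaky planar device
fully_depleted = off_current_ratio(ss_mv_per_decade=65)  # near-ideal device

print(f"planar on/off ratio:         {planar:.1e}")
print(f"fully depleted on/off ratio: {fully_depleted:.1e}")
```

At the same threshold voltage, the steeper-slope device turns off orders of magnitude harder – which is the leakage win Bohr is describing.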
Pawlowski – being a microarchitect and not a process technologist – has nothing but admiration for the engineers who create the silicon upon which his designs run. "The process guys are phenomenal. They really are," he told us. Referring to how the "process guys" manage to remain on a cadence of every 18 months to two years to bring out new process technologies, he said, "They're a machine."
Pawlowski doesn't see that cadence stumbling anytime soon. "I see them scaling to sub-10 nanometers well into 2020, 2022," he said. "And it's going to be our challenge of being able to use the transistors that are given us in a more efficient fashion."
That decade-or-so prediction is about right, thinks Borkar. "We're very confident that for the next 10 years, there is Moore's Law, there's no doubt about it," he said. "After that it's hazy."
Faggin extended Borkar's estimate to 20 years. "The mainstream for the next 10, 20 years is more of the same: faster processor, more cores, blah, blah, blah – all this stuff that we have been doing for the last 40 years," he said.
But the future is always hazy, Borkar says. "Ten years ago if you had asked me that, I would have said that."
But there are parts of that haze that are less hazy than others, Faggin says. "There are new things appearing on the horizon. First of all, we are beginning to come to the end of the road in terms of reducing the physical size of transistors. And so we have to find new ways of actually building these things – new materials to use, and so on."
Materials-man Bohr has some ideas as to what those materials might be – namely what are called III-V materials such as gallium arsenide and indium phosphide, which are given those Roman-numeral designations to indicate the number of their "valence electrons", the electrons in the outer shell of an atom that govern that element's interaction with other elements.
As Intel marches its process size down to 14nm, then 10nm, then 7nm, Bohr says, one area of investigation will be the use of III-V materials to coat the silicon substrate. III-V materials provide higher electron mobility, allowing transistors created in this way to be operated at lower voltages and with lower leakage.
But Bohr emphasized to us that silicon will still be the core material in play. "Please don't have the mistaken impression that we will be changing from silicon wafers to gallium arsenide wafers," he said. "That's not what we will do. What we will do is stay with a silicon wafer, and then deposit some very thin layers on top of that wafer that are these special III-V materials."
Such a relatively straightforward coating process would have the added benefit of keeping incremental costs-per-chip low. "So it's not going to be a major cost increase," he said – but then added: "But it will increase somewhat the cost and complexity of what we do, but that's kind of name of the game for the past ten years."
Transistor structures might change, as well, as they have with Intel's tri-gate technology. But as to what those changes might be, no one was talking. "There are lots of things on the drawing board," Borkar said. "It is not very clear what the winner is. It's real early."
Bohr agreed. "We are clearly in an era now where we have to be almost continually changing, improving, inventing new materials and new transistor structures. I think you'll see more of that in coming generations."
Going extreme – maybe
But no matter what materials are used to increase electron mobility and reduce leakage, there remains the problem of using photolithography to etch the transistors into the silicon – whether it's coated with a III-V material or not.
As Borkar explained, "As far as the lithography is concerned, the limit is 190nm light, right?" The 190nm light he was referring to is in the deep-ultraviolet range. The next step, however, is extreme ultraviolet lithography, known in the trade as EUV – which is around 13 to 13.5nm.
"The next is 13, and there is nothing else in sight," Borkar said.
But EUV – the next Holy Grail of chipbaking technology – is proving elusive. "EUV has several challenges," Bohr told us. One of those challenges is to create the reflective masks needed for EUV. "The other challenge," Bohr said, "is coming up with a high-intensity light source that has enough photons to expose the photoresist that you want – enough photons at the appropriate wavelengths, at the EUV wavelengths."
When we asked Bohr about the high power-consumption levels of EUV technology – a difficulty noted by others – he said "I'm not sure that's such a big deal. The industry is trying to develop a high-intensity light source, and you might measure that in power – how many watts of output does it have – and trying to achieve higher power levels is what we're actually trying to do. Just because the machine consumes 100 watts or 200 watts or whatever, I don't think that's a major problem."
There does remain one other big problem, he said. "It's getting enough intensity so you can expose the wafer quickly and get good throughput from the machine" – which is what Simon Segars of ARM's Physical IP Division was talking about at this year's Hot Chips conference when he said, "To have a fab running economically, you need to build about two to three hundred wafers an hour. EUV machines today can do about five."
As process dimensions drop to 20nm and below, 190nm lithography is resorting to double patterning – splitting a layer's features across two masks. When we asked Borkar if it would be possible to etch even smaller features with multiple masks, he said it would be possible, but expensive. "Hey, it's an act of desperation, but you gotta continue this going," he said. "We'll try hard – there is no stone that's not unturned. You've got to dig them up."
But should EUV become an affordable reality – and that remains a big "but" – it'll be clear sailing for a while, at least in terms of lithography. "Just imagine," Borkar said, "now with 190, 191, 192 nanometer light, I'm going down to 32 nanometers. In a breeze. So with 13 nanometers I can go down to a couple of nanometers, right?"
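Borkar's ratio argument is easy to check with back-of-the-envelope arithmetic. The sketch below treats 193nm-class light printing 32nm features as the baseline – our framing of his round numbers, not Intel data:

```python
# Back-of-the-envelope version of Borkar's argument: today's deep-UV
# scanners print features roughly 6x smaller than their wavelength, so
# a 13.5nm EUV source should, by the same ratio, reach a few nanometers.
# Purely illustrative arithmetic.

def printable_feature(wavelength_nm, ratio):
    """Feature size printable at a given wavelength-to-feature ratio."""
    return wavelength_nm / ratio

duv_ratio = 193 / 32  # ~6x: 193nm-class light printing 32nm features
euv_feature = printable_feature(13.5, duv_ratio)

print(f"DUV wavelength-to-feature ratio: {duv_ratio:.1f}x")
print(f"Implied EUV feature size: {euv_feature:.1f} nm")
```

The arithmetic lands right where Borkar does: a couple of nanometers.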
When we reminded him that at a "couple of nanometers" process size, he would be, as Bohr likes to say, "running out of atoms," Borkar just smiled. "I'll let the next generation figure that one out," he said.
As might be guessed, microprocessor architect Pawlowski was more interested in having the "process guys" hand him highly efficient chips than in exactly what materials wizardry they might use to stay on that ever-shrinking ramp.
Computational efficiency is on Pawlowski's mind, and it's core to the exascale initiative that Intel is conducting with the US and foreign governments. He even suggested that we do a bit of background reading on that topic: an article in the July/September issue of the IEEE Annals of the History of Computing by Stanford University's Jonathan Koomey and others entitled "Implications of Historical Trends in the Electrical Efficiency of Computing".
"What they're looking at is computational efficiency and how that's evolved since 1946 up to 2009," he said about the article. "And they've basically shown that it's followed a Moore's law, in that the improvement of flops-per-watt has basically doubled every 1.5 to 1.6 years. We've actually seen that before, but these guys formulated it and put it in a paper."
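The trend Pawlowski describes is easy to sanity-check: a doubling every ~1.5 years compounds enormously over the 1946-to-2009 span the paper covers. A quick illustrative calculation (the 1.55-year midpoint is our assumption, splitting his "1.5 to 1.6" range):

```python
# Sketch of the Koomey-paper trend Pawlowski cites: computational
# efficiency doubling every ~1.5 to 1.6 years. Illustrative only.

def efficiency_gain(years, doubling_period_years):
    """Cumulative efficiency multiplier over a span of years."""
    return 2 ** (years / doubling_period_years)

gain = efficiency_gain(years=2009 - 1946, doubling_period_years=1.55)
print(f"Implied efficiency improvement, 1946-2009: {gain:.1e}x")
```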
Following that trendline to its logical conclusion, Pawlowski said, "We went to the US government and said 'By the time you get to exascale, these are going to be 150-megawatt machines – is that what you want?' And they said, 'No, we need to have something at about 20 megawatts'."
Problem. To reach that low-power requirement would require architectures that use only about three picojoules per flop (floating point operation). A picojoule, by the way, is one million-millionth of a joule (10⁻¹²). If you're sitting quietly at rest, you emanate about 100 joules of heat energy every second.
Translation: a picojoule is a very, very small amount of energy.
Even if those aforementioned process guys can continue their scaling and materials-development successes for the foreseeable future, reaching exascale will require more, Pawlowski says: "Our current Core architecture is maybe about at 16 picojoules per flop – and now we've gotta get about 5 or 6X better than that."
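Those picojoule figures translate directly into sustained power: watts are just operations per second times joules per operation. A quick back-of-the-envelope using the per-flop numbers quoted above:

```python
# Sanity check on the per-flop energy figures quoted in the article:
# sustained power = (flops per second) x (joules per flop).

EXAFLOP = 1e18  # floating-point operations per second

def sustained_watts(flops_per_sec, picojoules_per_flop):
    """Sustained power draw implied by a given energy-per-flop."""
    return flops_per_sec * picojoules_per_flop * 1e-12

print(f"At 16 pJ/flop: {sustained_watts(EXAFLOP, 16) / 1e6:.0f} MW")
print(f"At  3 pJ/flop: {sustained_watts(EXAFLOP, 3) / 1e6:.0f} MW")
```

Run at an exaflop, 16 picojoules per flop burns 16 megawatts in the compute path alone; three picojoules brings that down to three megawatts – the sort of 5-6X improvement Pawlowski is after.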
When we asked Pawlowski what technologies and techniques he was looking at to get that large of an increase in computational efficiency, he had a few ideas – but few he could share with us. "There's not really too much I should tell you," he chuckled. "But the bottom line is we're tearing apart the applications. So instead of just building the machine and then saying, 'Okay, here, programmer, go write it', we're actually looking at the applications for the different workloads."
Is IA headed for the boneyard?
Our next question was obvious: would changing applications mean changing the IA architecture? (Okay, "IA architecture" is an unnecessarily redundant and repetitive repetition, but you know what we mean...) Pawlowski firmly rejected that idea. "No. IA will still be there. We'll still be executing x86 instructions."
Which makes sense, seeing as how x86 processors have been decoding IA instructions into micro-operations – or µops – since the days of Intel's P6 architecture, which first appeared in the Pentium Pro.
"You see, that's the beauty of an architecture like the P6," Pawlowski said, "because the instruction comes into the front end, but what's put out the back end and executed on the machine itself is something completely different. But it still runs the program, even though the micro-instruction may not look anything like the macro-instruction, the IA instruction."
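That front-end/back-end split can be caricatured in a few lines. The mnemonics and decompositions below are entirely made up for illustration – they are not Intel's actual decode tables:

```python
# Toy illustration of the P6-style split between architectural (IA)
# instructions and the micro-ops actually executed. The mnemonics and
# decompositions here are invented for illustration; they are not
# Intel's real decode tables.

DECODE_TABLE = {
    # A memory-to-register add becomes a load uop plus an ALU uop
    "ADD reg, [mem]": ["load tmp, [mem]", "alu_add reg, tmp"],
    # A simple register-register move is a single uop
    "MOV reg, reg2": ["alu_mov reg, reg2"],
}

def decode(ia_instruction):
    """Front end: translate one IA instruction into its micro-ops."""
    return DECODE_TABLE[ia_instruction]

for insn in ("ADD reg, [mem]", "MOV reg, reg2"):
    print(f"{insn!r} -> {decode(insn)}")
```

The back end only ever sees the right-hand side – which is why the macro-instruction set can stay fixed while the machine underneath changes completely.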
From architect Pawlowski's point of view, however, the days of microprocessor architects saying to process technologists "you build those tiny transistors and we'll figure out how to use them" are drawing to an end.
"The two groups, microarchitects and process-technology techs," he told us, "have mostly worked in their own worlds. As the process guys are coming up with a new process, they'll come and say, 'Okay, what kind of circuits do you need? What do you think you're going to look at?' And so we'll tell them, and they'll go off and they'll develop the device and give us device models, and we'll be able to build the part."
He also implied that process techs have saved the architects' hineys more than once. "I'm kind of embarrassed to say this," he said, "but the device guys have come through many times. If there was something that we needed, they were able through process to make whatever tweaks they needed to make to let the architecture continue the way it was."
Those last-minute saves may no longer be possible in the future. "Now, I think, the process is going to be more involved in defining what our architecture is going to look like as we go forward – I'm talking by the end of the decade. We're on a pretty good ramp where we are, but as I look how things are going to change by the end of the decade, there's going to be a more symbiotic relationship between the process engineers and the architects," Pawlowski said.
"The process guys and the architecture guys are going to be sitting together."
Your brain is safe – for now
All this increasing computational efficiency, decreasing power requirements, and increasing chip complexity raises one important question: what's all this computational power going to be used for?
Addressing that question at the 40th birthday party of the Intel 4004, Federico Faggin was quick to dismiss the idea – for the foreseeable future, at least – that computational machines would be able to mimic the power of the human brain.
"There is an area that is emerging which is called cognitive computing," he said, defining that area of study as being an effort to create computers that work in a fashion similar to the processes used by the human brain.
"Except there is a problem," Faggin said. "We do not know how the brain works. So it's very hard to copy something if you don't know what it is."
He was dismissive of the recurring optimism in this area. "There is a lot of research," he said, "and every ten years, people will say 'Oh, now we can do it'."
He also included himself among those researchers. "I was one who did the early neural networks in one of my companies, Synaptics," he said, "and I studied neuroscience for five years to figure out how the brain works, and I'll tell you: we are far, far away from understanding how the brain works."
A goal far beyond building a computing system that could mimic processes used by the human brain would be imbuing a computer with intelligence – and Faggin wasn't keen on that possibility, either.
"Can they ever achieve intelligence and the ability of the human brain?" he asked, referring to efforts by the research community. "'Ever' is a strong word, so I won't use the word 'ever', but certainly I would say that the human brain is so much more powerful, so much more powerful, than a computer."
Aside from mere synapses, neural networks, and other brain functions, Faggin said, there are processes that we simply haven't even begun to understand. "For example," he said, "let's talk about consciousness. We are conscious. We are aware of what we do. We are aware of experiences. Our experience is a lived thing. A computer is a zombie. A computer is basically an idiot savant."
He also reminded his audience that it is man who built the computer, and not the other way around. "[A computer's] intelligence is the intelligence of the programmer who programmed it. It is still not an independent, autonomous entity like we are," he said.
"The computer is a tool – and as such, is a wonderful tool, and we should learn to use it ever more effectively, efficiently. But it's not going to be a competition to our brain."
The purpose of the computer, said the man who helped bring them into so many aspects of our lives, is to help us be "better human beings in terms of the things that matter to a human, which are creativity, loving, and living a life that is meaningful."
After we whipped together our "Happy 40th birthday, Intel 4004!" article earlier this month, we were handed a copy of the November 15, 1971 Electronic News advert that launched the Intel 4004. We thought you might want to take a gander:
Great for 'data terminals, billing machines, measuring systems, and numerical control systems' (click to enlarge)