Happy 40th birthday, Intel 4004!
The first of the bricks that built the IT world
Feeling the strain
But before it did, there was work to be done on process technology, and the introduction of the first of the three major post-scaling technologies that Mark Bohr talked about: strained silicon.
In a highly simplified nutshell, strained silicon involves the material being stretched – or strained – in such a way as to pull the individual silicon atoms apart from one another. Doing so frees up the electrons and holes in the material, increasing their mobility substantially, thus allowing for lower-power transistor designs.
Although strained silicon had been under investigation at MIT and elsewhere, the early techniques were biaxial – that is, the entire silicon lattice was stretched. Intel's breakthrough was the development of uniaxial stretching. Biaxial straining was good for nMOS but bad for pMOS, both of which need to be balanced for good transistor performance.
Biaxial straining also had problems with source drain and defects, Bohr told us – "not a very manufacturable technology". The uniaxial approach, however, could be applied "just to the pMOS device," Bohr said, "and it didn't have any significant yield issues, so it turned out to be both a high-performance solution and a good manufacturing solution."
But back to the departure and then the return of P6.
The follow-on architecture to P6 was NetBurst, and it was not exactly Intel's finest hour. By the time P6 had evolved into the Pentium III, its pipeline was just 10 stages long; NetBurst doubled that to 20 stages in the Willamette Pentium 4 in 2000, and increased that "Hyper Pipelined Technology" to 31 stages in the Prescott Pentium 4 in 2004 – which, by the way, was the first processor to use Bohr's 90nm strained silicon process technology.
According to Pawlowski, the reason for the deeper pipeline was "frequency, frequency, frequency". In a bit – well, more than a bit – of an oversimplification, deep pipelines require higher frequencies to achieve the same performance as architectures with shorter pipelines.
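To put rough numbers on that oversimplification, here is a toy sketch in Python. It assumes an ideal one cycle per instruction, plus a penalty of one full pipeline refill each time the pipeline has to be flushed (after a mispredicted branch, say). The function name, the 2 per cent flush rate and the 1GHz baseline are illustrative assumptions, not Intel figures; only the 10, 20 and 31-stage depths come from the text above.

```python
def required_frequency_ghz(base_freq_ghz, base_depth, deep_depth,
                           flushes_per_instruction=0.02):
    """Clock a deeper pipeline needs to match a shallower one's throughput.

    Toy model (an assumption, not a measured result): each instruction costs
    one cycle, and each pipeline flush costs a number of cycles equal to the
    pipeline depth.
    """
    # Average cycles per instruction for the shallow and the deep pipeline.
    base_cpi = 1 + flushes_per_instruction * base_depth
    deep_cpi = 1 + flushes_per_instruction * deep_depth
    # Equal instructions per second means frequency must scale with CPI.
    return base_freq_ghz * deep_cpi / base_cpi


# Compare the pipeline depths mentioned above against a 10-stage, 1GHz baseline.
for depth in (10, 20, 31):
    print(depth, "stages:", round(required_frequency_ghz(1.0, 10, depth), 2), "GHz")
```

Under these assumptions, the longer the pipeline, the more clock speed it takes just to break even with the 10-stage design, and the gap widens as the flush rate rises.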