
Pentium 4 dissected

Gory details of inner workings revealed


What do the 42 million transistors on a Pentium 4 actually do?

Willamette, aka Pentium 4, is the first new processor design Intel has launched since the Pentium Pro. Sure, there have been the Pentium II, Pentium III, Celeron and Xeon, but these all use the P6 microarchitecture introduced with the PPro.

The problem with P6 is that, due to the pipelining it uses, it's subject to an absolute speed limit which, on a 0.18 micron process, equates to around 1.2GHz. Try to run it any faster than that and it just gets hotter rather than doing any more useful work.

The problems Chipzilla encountered with the 1.13GHz PIII are testament to the fact that the PIII is perilously close to its absolute speed limit.

P4 is entirely new and uses the tragically-trademarked NetBurst architecture with hyper-pipelined technology: a pipeline twice the length of P6's, which significantly increases frequency scalability.

The downside is that the longer pipe means fewer instructions can be executed per clock tick than on a PIII (or Athlon). So at comparable clock speeds, a PIII or Athlon can be seen to outperform a P4.

This is an anomaly that will disappear as P4 moves ever onward and upward to clock speeds physically unattainable to the older architectures.
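The trade-off can be put in rough numbers: delivered performance is approximately instructions per clock (IPC) multiplied by clock frequency. The sketch below is an illustration only. The IPC values are made-up assumptions, not measurements of any real chip, and the function names are hypothetical.

```python
def relative_performance(ipc: float, clock_ghz: float) -> float:
    """Rough throughput in billions of instructions per second."""
    return ipc * clock_ghz


def break_even_clock(ipc_short: float, clock_short_ghz: float, ipc_long: float) -> float:
    """Clock speed the long-pipeline chip needs to match the short-pipeline one."""
    return ipc_short * clock_short_ghz / ipc_long


# Hypothetical IPC figures, chosen only to show the shape of the argument.
P6_IPC = 1.0        # assumed: shorter pipeline, more work per tick
NETBURST_IPC = 0.8  # assumed: longer pipeline, less work per tick

# At the same clock, the short-pipeline part wins.
same_clock_p6 = relative_performance(P6_IPC, 1.2)
same_clock_p4 = relative_performance(NETBURST_IPC, 1.2)

# Clock the long-pipeline part must reach to draw level with a 1.2GHz P6.
needed_ghz = break_even_clock(P6_IPC, 1.2, NETBURST_IPC)
```

Under these assumed figures the longer pipeline only pays off once it clocks past the break-even point, which is the bet Intel is making with NetBurst.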

P4's rapid execution engine isn't something introduced by Dubya Bush to reduce the backlog of people on death row in Texas prisons, but a mechanism which runs the processor's arithmetic logic units at twice the core frequency of the rest of the chip.

Screaming Sindy gets more extensions

The Pentium 4 also has improved dynamic execution to predict branches more accurately. An execution trace cache stores decoded instructions, which removes the decoder from the main instruction loop. The P4 also supports 144 new Streaming SIMD Extensions 2 (SSE2) instructions, covering double precision floating point, 128-bit SIMD integer, and improved cache and memory management.

The i850 (Tehama) chipset supports dual channel Rambus memory at an effective 400MHz FSB speed with a throughput of 3.2GB/sec, while AGP 4X graphics run at over 1GB/sec - twice as fast as AGP 2X.
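The throughput figures above follow from bus width multiplied by transfer rate. The sketch below reproduces the arithmetic. The bus widths and base clocks are not given in the article and are assumed here from the commonly published specifications: a 64-bit FSB at 100MHz moving four transfers per clock, two 16-bit PC800 Rambus channels, and a 32-bit AGP bus at roughly 66MHz.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfers_per_sec: float, channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/sec (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_sec * channels / 1e9


# Assumed: 64-bit front side bus, 100MHz clock, four transfers per clock.
fsb = peak_bandwidth_gb_s(64, 100e6 * 4)

# Assumed: PC800 RDRAM, 16 bits wide, 800 million transfers/sec, two channels.
rdram = peak_bandwidth_gb_s(16, 800e6, channels=2)

# Assumed: 32-bit AGP bus with a base clock of about 66.67MHz.
AGP_BASE_HZ = 66.67e6
agp2x = peak_bandwidth_gb_s(32, AGP_BASE_HZ * 2)
agp4x = peak_bandwidth_gb_s(32, AGP_BASE_HZ * 4)
```

On these assumptions the dual Rambus channels deliver the same peak rate as the front side bus, so neither side starves the other. These are theoretical peaks; sustained rates in real systems are lower.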

The move to 0.13 micron in the second half of next year also sees Intel moving to copper interconnects for the first time. Alongside this, a move to 300mm wafers will reduce production costs. Intel claims the change from aluminium to copper will produce a speed increase of around 65 per cent, while using less power and generating less heat. The smaller die size alone will reduce costs by around 30 per cent.
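For a feel of the scale of these changes, the geometry can be sketched as follows. This is idealised arithmetic only: it assumes die area scales with the square of the process feature size, and it ignores wafer edge loss, yield and per-wafer processing cost, so it does not reproduce Intel's cost claims. The function names are hypothetical.

```python
import math


def wafer_area_mm2(diameter_mm: float) -> float:
    """Surface area of a circular wafer."""
    return math.pi * (diameter_mm / 2) ** 2


def ideal_die_area_scale(old_process_um: float, new_process_um: float) -> float:
    """Idealised die area ratio after a shrink (area goes as feature size squared)."""
    return (new_process_um / old_process_um) ** 2


# A 300mm wafer compared with a 200mm wafer.
wafer_ratio = wafer_area_mm2(300) / wafer_area_mm2(200)

# The same design moved from 0.18 micron to 0.13 micron, under the ideal-scaling assumption.
die_scale = ideal_die_area_scale(0.18, 0.13)

# Idealised gain in dies per wafer from both changes together.
dies_per_wafer_gain = wafer_ratio / die_scale
```

Under these assumptions a 300mm wafer offers 2.25 times the area of a 200mm one, and a perfectly scaled die shrinks to a little over half its 0.18 micron size.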

By 2003, Intel plans to have five fabs producing 0.13 micron 200mm wafers and three at 300mm. ®
