Pentium 4: performance puzzle begins

Will it, won't it be a goer?

At a technical track in San Jose today, the principal architect of the Pentium 4, Doug Carmeon, briefed delegates on the performance of the up-and-coming processor.

But although Carmeon gave many comparisons between a 1.4GHz Pentium 4 and a 1GHz Pentium III, the information from his talk was hard to interpret.

First off, the die size for the Pentium 4 has increased from the projected 170mm² to 217mm², meaning that the processor may cost more than Intel or the world originally anticipated.

Carmeon began by saying the key goal of his team was to deliver "world acclaimed performance" with headroom for the future.

He outlined much of what is now generally known, including its 400MHz system bus, its advanced dynamic execution, its rapid execution engine, its advanced transfer cache, and its hyper pipelined technology. Not to forget, of course, SSE2 (Screaming Sindy 2), with its 144 new instructions.

He said performance on a microprocessor is determined by frequency times instructions per cycle. Frequency, he said, is typically limited by process technology.
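That rule of thumb is easy to put into numbers. The sketch below is only an illustration of the frequency-times-IPC model: the 1.4GHz and 1GHz clocks are the ones compared in the talk, but the IPC ratios are hypothetical placeholders, since no per-clock figures were disclosed.

```python
def relative_performance(freq_ratio: float, ipc_ratio: float) -> float:
    """Performance ratio under the simple model: frequency x instructions per cycle."""
    return freq_ratio * ipc_ratio


# Clock ratio of the two chips compared in the talk: 1.4GHz P4 vs 1GHz PIII.
freq_ratio = 1.4 / 1.0

# Hypothetical IPC ratios (P4 relative to PIII); none were given in the talk.
for ipc_ratio in (0.8, 1.0, 1.2):
    perf = relative_performance(freq_ratio, ipc_ratio)
    print(f"IPC ratio {ipc_ratio:.1f} -> overall performance ratio {perf:.2f}")
```

The point of the exercise is that a 40 per cent clock advantage can be eroded, or amplified, by whatever happens to work done per clock.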

The Pentium III architecture has a ten-stage pipeline, and the Pentium 4 a 20-stage pipeline. Long pipelines, he said, call for better branch prediction. Intel has developed its own algorithm, which he claimed is better than any publicly known prediction scheme.

The advanced dynamic execution in the Pentium 4 made for "very deep speculated execution", three times that of the Pentium III. The chip can handle 48 loads and 24 stores, respectively three and two times as many as the older chip.

The arithmetic logic unit (ALU) latency was better, with the Pentium III taking one clock, or one nanosecond at 1GHz, against 0.3 nanoseconds for the Pentium 4.

L1 cache is twice as fast as on the Pentium III, which takes three clocks at 1GHz, equal to three nanoseconds, while the P4 manages two clocks at 1.4GHz.
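The latency figures are just clocks divided by clock speed, and can be checked in a few lines. This is a rough sketch: it takes the 1.4GHz part that the talk compared against the 1GHz Pentium III, and the half-clock ALU figure is our assumption (an ALU running at twice the core clock), not something stated in the presentation.

```python
def latency_ns(clocks: float, freq_ghz: float) -> float:
    """Convert a latency in clock cycles to nanoseconds at a given clock speed."""
    return clocks / freq_ghz


# L1 cache: figures quoted in the talk.
p3_l1 = latency_ns(3, 1.0)  # Pentium III, three clocks at 1GHz
p4_l1 = latency_ns(2, 1.4)  # Pentium 4, two clocks at 1.4GHz
print(f"L1: PIII {p3_l1:.2f}ns, P4 {p4_l1:.2f}ns, ratio {p3_l1 / p4_l1:.2f}")

# ALU: PIII figure is quoted; the P4 half-clock latency is an assumption.
p3_alu = latency_ns(1, 1.0)
p4_alu = latency_ns(0.5, 1.4)
print(f"ALU: PIII {p3_alu:.2f}ns, P4 {p4_alu:.2f}ns")
```

On those numbers the L1 cache comes out a little over twice as fast, in line with the claim, and the assumed half-clock ALU lands in the region of the quoted 0.3 nanoseconds.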

L2 cache, that is the advanced transfer cache, uses 128-byte lines, and holds both data and instructions. Average cache speed on the newer chip is 1.8 times better for typical desktop applications.

Carmeon concluded his presentation by showing a chart containing these comparisons and claimed: "This is the world's highest performance desktop microprocessor." He did say that applications optimised for the new SSE2 instruction set would typically show the biggest performance boost in real life.

No benchmark details comparing the two chips were given, and we were left with the overall feeling that without them, it would be hard to judge just how the Pentium 4, which requires a 40W power supply, matches up to these claims.

A subsequent roundtable meeting did little to clear up the matter. Intel showed the latest stepping of the die and was repeatedly pressed on performance and benchmarking issues, finally giving one benchmark for Media from the Sysmark 2000 suite, which showed that a 1.4GHz P4 performed 1.5 times better than a 1GHz PIII.
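That single data point can be turned around to estimate what the P4 gains per clock. The sketch below simply divides the quoted speed-up by the clock ratio; it assumes performance scales linearly with frequency, which is a simplification of ours and tends not to hold once memory speed enters the picture.

```python
def implied_per_clock_gain(speedup: float, freq_ratio: float) -> float:
    """Per-clock performance ratio implied by a speed-up, assuming linear clock scaling."""
    return speedup / freq_ratio


# Quoted result: a 1.4GHz P4 scores 1.5 times a 1GHz PIII on the Media benchmark.
gain = implied_per_clock_gain(speedup=1.5, freq_ratio=1.4 / 1.0)
print(f"Implied per-clock gain: {gain:.2f}x")
```

On that crude reckoning, most of the 1.5 times figure is clock speed, with only a few per cent left over per clock on this one benchmark.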

Later in the day, Pat Gelsinger, Intel's CTO, said that the ramp of the Pentium 4 would increase over its first nine months, with a typical system costing around $2500.

During the weeks before the part is unleashed on the world, Intel will release more benchmarks. It will be interesting to see just how much faster the beast will be to justify the additional price.

While this is mere speculation, conversation with colleagues produced a consensus that the headroom, in raw megahurts terms, may allow Intel to extend the clock speed to 5GHz or 6GHz over the next 18 months. We shall return to this topic over the weekend.

A direct comparison between the two chips, each running at 1GHz, would be especially revealing right now, but that's too much to hope for from the tight-lipped microprocessor community. ®
