Transmeta blades power landmark supercomputer breakthrough
The home of the atom bomb gave an extraordinary and unexpected endorsement of Transmeta's low-power Crusoe chip, and of the ultra-dense blades pioneered by RLX Technologies, last week, InfoWorld reports. It may even have come up with an answer to the Gelsinger Coefficient.
Los Alamos Labs in New Mexico has built a modest 240-node Beowulf cluster from RLX's Crusoe-powered blades, which is humble indeed compared to the vast ASCI showpieces. But according to the system's designer, Wu-chun Feng, it marks an inflection point for the microprocessor industry. And to prove the point, VAX godfather Gordon Bell and Linus Torvalds were present at the birth.
The cluster, named Green Destiny, holds its own against x86 clusters of similar horsepower. Only it sits in a hot warehouse and doesn't require additional cooling. Feng reckons this adds up to only a third of the total cost of ownership of an equivalent Intel-powered system, once you include floorspace and energy requirements.
Feng expounds at length in this very readable design paper, although we're going to quote liberally from it here.
Feng doesn't advocate Transmeta blades as an overnight, drop-in replacement for traditional supercomputing clusters, even ones based on commodity Intel processors. But he does say that the power and heat dissipation requirements of Intel's x86 designs make low-power approaches such as the Crusoe blades far more viable in the long term.
Long-time Register readers will recall this acknowledgement, because we ran a little competition to mark the occasion: see Intel touts Alpha, IBM designs to beat 'hotter than reactor' chips, Chip designers vow to cool overheating Gelsinger, and British, American scientists discover Gelsinger co-efficient for details of previous research work in this area.
But the nuke experts may have avoided a future meltdown, as Feng explains.
Microprocessor design bets that lower voltages and instruction efficiencies will temper the trend for ever more dense, power-guzzling CPUs. But the theory isn't working:
"Lowering the voltage by half drops the power dissipation to one-quarter of its former value; however, lowering the operating voltage also lowers the maximum operating frequency, thus decreasing performance. This vicious cycle will continue throughout this decade and become more and more difficult to deal with," he writes.
"So, what is the solution to this dilemma? Quit using the 'increasing clock frequency = increasing performance' marketing ploy. We believe that the ideal clocking frequency will be in the 500-MHz to 1-GHz range and that to improve performance, the microprocessor must do more work on a per-cycle basis, e.g., see the results for the IBM Power3...
"By keeping both the voltage and frequency low, power dissipation is also kept low. A detailed discussion about this controversial statement is beyond the scope of this paper," he adds.
And controversial it certainly is. As any Macintosh, Cray or indeed any RISC user will tell you, megahertz doesn't equate to speed.
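The one-quarter figure in Feng's quote above is what you get from the usual first-order model of CMOS dynamic power, P = C x V^2 x f, which the article doesn't spell out. Here is a minimal sketch assuming that model. The function name and the capacitance, voltage and frequency values are placeholders picked only to show the ratios, not figures for Crusoe or any Intel part.

```python
def dynamic_power(capacitance, voltage, frequency):
    """First-order CMOS dynamic power, P = C * V^2 * f.

    This model is an assumption used for illustration; it is not
    taken from Feng's paper.
    """
    return capacitance * voltage ** 2 * frequency


# Placeholder operating point (hypothetical values).
C = 1e-9   # farads of switched capacitance
V = 2.0    # volts
F = 1e9    # hertz

baseline = dynamic_power(C, V, F)

# Halve the voltage and keep the clock: power scales with V^2.
half_voltage = dynamic_power(C, V / 2, F)

# If the lower voltage also forces the clock down by half, power falls
# further, but so does the number of cycles available per second.
half_voltage_half_clock = dynamic_power(C, V / 2, F / 2)

print(f"half voltage:            {half_voltage / baseline:.3f} of baseline power")
print(f"half voltage, half clock: {half_voltage_half_clock / baseline:.3f} of baseline power")
```

The second ratio is the other half of Feng's "vicious cycle": the power saving comes bundled with a slower clock, which is why he wants more work done per cycle instead.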
The latest Crusoe blade cluster, says Feng, has a tenth of the performance per watt of ASCI Red.
"Performance has increased by a factor of 2000 since the Cray C90, performance per square foot has only grown by a factor of 65; thus, while supercomputers are getting faster, they are making less and less efficient use of the space that they occupy, and space is money. As the thermal power dissipation continues to increase linearly, the performance/space ratio is going in the opposite direction."
That's a problem, even in the vast expanses of New Mexico, he argues. Someone's got to pay for it. Feng argues that instead of traditional flops, a new metric involving total cost of ownership, one that includes cooling costs and floorspace, should be used.
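To put numbers on the space argument: if performance grew 2000-fold while performance per square foot grew only 65-fold, the floor area must have grown by the ratio of the two. The sketch below works that out, then shows one way a cost-of-ownership metric of the kind Feng describes might be put together. The article doesn't give his formula, so the `Cluster` fields, the function names and every input value here are hypothetical and only illustrate the shape of the calculation.

```python
from dataclasses import dataclass

# Growth factors quoted by Feng, relative to the Cray C90.
PERFORMANCE_GROWTH = 2000
PERFORMANCE_PER_SQ_FT_GROWTH = 65

# performance / area grew 65x while performance grew 2000x,
# so area grew by 2000 / 65.
implied_floor_space_growth = PERFORMANCE_GROWTH / PERFORMANCE_PER_SQ_FT_GROWTH
print(f"Implied floor space growth: about {implied_floor_space_growth:.0f}x")


@dataclass
class Cluster:
    """Hypothetical cluster description; every field is an assumption."""
    gflops: float
    purchase_cost: float       # dollars
    power_kw: float            # average electrical draw
    cooling_overhead: float    # extra cooling power as a fraction of power_kw
    floor_space_sq_ft: float


def total_cost_of_ownership(cluster, years, dollars_per_kwh, dollars_per_sq_ft_year):
    """Purchase price plus energy (including cooling) plus floor space."""
    hours = years * 365 * 24
    energy_kwh = cluster.power_kw * (1 + cluster.cooling_overhead) * hours
    energy_cost = energy_kwh * dollars_per_kwh
    space_cost = cluster.floor_space_sq_ft * dollars_per_sq_ft_year * years
    return cluster.purchase_cost + energy_cost + space_cost


def gflops_per_tco_dollar(cluster, **costs):
    """Performance per dollar of ownership rather than raw flops."""
    return cluster.gflops / total_cost_of_ownership(cluster, **costs)


# Made-up inputs, not measurements of Green Destiny or any real system.
example = Cluster(gflops=100.0, purchase_cost=300_000.0, power_kw=5.0,
                  cooling_overhead=0.0, floor_space_sq_ft=6.0)
costs = dict(years=4, dollars_per_kwh=0.10, dollars_per_sq_ft_year=100.0)

print(f"TCO: ${total_cost_of_ownership(example, **costs):,.0f}")
print(f"Gflops per TCO dollar: {gflops_per_tco_dollar(example, **costs):.6f}")
```

A cluster that needs no extra cooling sets `cooling_overhead` to zero, which is where a design like Feng's would pick up its advantage under a metric of this sort.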
Feng doesn't suggest that Crusoe blades are going to become the parallel computing commodity overnight:
"Bladed Beowulf is not meant to be a replacement for these other types of architectures (at least, not yet)…. When it comes down to raw performance, the Bladed Beowulf simply cannot compete with ASCI-style supercomputers due to their massive compute and communication capabilities…" he writes.
But it is a trend to look out for.
"The continued tracking of Moore’s law will result in the microprocessor of 2010 having over one billion transistors and dissipating over one kilowatt of thermal energy; this is considerably more energy per square centimeter than even a nuclear reactor."
And the nuclear guys have enough work to do tracking the stockpiles, without tending a home-grown PC bomb of their own. But at AMD and Intel, is anyone listening?
By the way, Gelsinger's extrapolations are based on current x86 design. Where this leaves the power-sapping IA-64 Hotinium is anyone's guess. Right now a cluster of Itanics requires a corresponding cluster of human-powered bicycles attached to fans, which probably only the population of China can man. ®