Nvidia forges ARM chip for PCs and servers
Rumors true – except for the x64 bit
CES 2011 The ARM race for the data center just got a whole lot more interesting. At the Consumer Electronics Show in Las Vegas, Nvidia announced that it is indeed working on a CPU and that the chip is based on the ARM RISC architecture that's wickedly popular in smartphones and tablets. In other words, it's not a low-powered x64 chip.
In the funny way that the world works, Intel, which owns the processor market for PCs and servers, was probably of two minds about the rumors of graphics chip maker Nvidia creating an x64 clone and taking on the chip giant in the CPU space.
Intel didn't need the competition, with Advanced Micro Devices already in there, but another x64 competitor is easy enough for Intel to defend against. With Microsoft also announcing today at CES that Windows will run on ARM chips, and with Google's Android Linux, Apple's iOS, and Canonical's Ubuntu already on the low-powered processors, ARM now covers a big portion of modern computing.
Throw in Red Hat Enterprise Linux and its clones from CentOS and Oracle, and you can pretty much call it a day in terms of OS coverage for ARM. Alas, today was not Intel's lucky day, no matter how much revenue and profit it stands to get in the near term from its new "Sandy Bridge" Core PC chips and future Xeon variants for workstations and servers. With Nvidia and Microsoft jumping into the ARM race, the future for Intel is going to get a whole lot tougher.
After going on and on for most of an hour about "super phones" based on Nvidia's Tegra 2 system-on-a-chip, which has two ARM Cortex-A9 cores and which combines processing capability with HD graphics, Jen-Hsun Huang, president and chief executive officer at Nvidia, announced in his keynote address that Nvidia was indeed working on a processor that would eventually be integrated with its graphics co-processors. The ARM-happy effort is known as "Project Denver".
Nvidia has also licensed the future Cortex-A15 design from ARM Holdings, the company behind the ARM design, joining Qualcomm, Texas Instruments, Calxeda, and many others who are also goosing the ARM chip to grow it from its mobile phone niche to more general-purpose computing. All of these companies believe that the technical ecosystem and user base for smartphones is larger, and that the ARM architecture is more energy efficient than the x64 architecture. However, the ARM architecture has its own issues, with 32-bit memory addressing today and a future 40-bit extended addressing scheme still in the works. Many believe that to be useful in the server space, ARM chips need to get to true 64-bit addressing. We'll see.
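To put those address widths in perspective, here is a back-of-the-envelope sketch (illustrative only, not Nvidia's numbers) of how much memory each width can reach:

```python
# Addressable memory for the address widths mentioned above.
# 32-bit is ARM's status quo, 40-bit is the planned extension,
# 64-bit is what many say servers ultimately need.
for bits in (32, 40, 64):
    limit = 2 ** bits
    print(f"{bits}-bit addressing: {limit:,} bytes ({limit // 2**30:,} GiB)")
```

A 32-bit core tops out at 4 GiB, the 40-bit extension reaches 1 TiB, and only true 64-bit addressing gets past the memory footprints of big server workloads.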
"The energy around the ARM architecture is absolutely enormous," said Huang in his keynote. "There is no question now that in two or three years time - overnight practically in the technology business - that ARM is the new ISA, the new standard."
Then Huang flashed up this chart to make his point:
Huang said that Project Denver was one of the most important announcements that Nvidia has made in its history, and he said that a team of engineers with expertise with ARM, Sparc, x64, Power, and MIPS chips has been hard at work for years on figuring out what future smartphones, tablets, PCs, and servers might need in terms of CPUs and GPUs. "I think this is a game changer," Huang said.
Nvidia is not providing much in the way of detail about Project Denver, but Andy Keane, general manager of Tesla supercomputing at Nvidia, told El Reg that Nvidia was slated to deliver its Denver cores concurrent with the "Maxwell" series of GPUs, which are due in 2013. As we previously reported, Nvidia's "Kepler" family of GPUs, implemented in 28 nanometer processes, is due this year, delivering somewhere between three and four times the gigaflops per watt of the current "Fermi" generation of GPUs. The Maxwell GPUs are expected to offer about 16 times the gigaflops per watt of the Fermi. (Nvidia has not said what wafer baking process will be used for the Maxwells, but everyone is guessing either 22 or 20 nanometers.)
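Working through those multipliers (our arithmetic, using the figures quoted above with an assumed midpoint for Kepler's 3x-4x range), the implied generation-over-generation jump looks like this:

```python
# Relative gigaflops per watt, normalized to Fermi = 1.0.
fermi = 1.0
kepler = 3.5 * fermi    # article quotes 3x to 4x Fermi; midpoint assumed
maxwell = 16.0 * fermi  # article quotes roughly 16x Fermi

print(f"Kepler over Fermi:   {kepler / fermi:.1f}x")
print(f"Maxwell over Kepler: {maxwell / kepler:.1f}x")
```

In other words, Maxwell would need another four-to-five-fold efficiency jump beyond Kepler to hit that 16x figure.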
While Keane would not say how many ARM cores would be bundled on the Maxwell GPUs, he did confirm that Nvidia would be putting a multicore chip on the GPUs and hinted that it would be considerably more than the two cores used on the Tegra 2 SoCs. "We are going to choose the number of cores that are right for the application," said Keane.
One thing that Nvidia is not going to do is create a standalone CPU that fits in a processor socket and gets sold in cheap servers. The Nvidia plan is to bake a CPU-GPU hybrid and sell it into server and PC markets where the combination of serial and massively parallel processing makes sense. Nvidia is not just shooting for the supercomputing and data analytics markets; it believes that future Web infrastructure and general-purpose computing will be able to make full use of the ceepie-geepie hybrid chips.
Keane was also mum on what kind of memory addressing the Denver cores would have. "It is a logical question, and we will do the logical thing," he said with a laugh.
Bill Dally, Nvidia's chief scientist and vice president of research and formerly the chairman of the computer science department at Stanford University, took jabs at the x64 architecture in a blog about the announcement of the Denver ceepie-geepie effort.
"Denver frees PCs, workstations and servers from the hegemony and inefficiency of the x86 architecture," Dally wrote. "For several years, makers of high-end computing platforms have had no choice about instruction-set architecture. The only option was the x86 instruction set with variable-length instructions, a small register set, and other features that interfered with modern compiler optimizations, required a larger area for instruction decoding, and substantially reduced energy efficiency.
"Denver provides a choice. System builders can now choose a high-performance processor based on a RISC instruction set with modern features such as fixed-width instructions, predication, and a large general register file. These features enable advanced compiler techniques and simplify implementation, ultimately leading to higher performance and a more energy-efficient processor."
So now the lines are drawn on the battlefield. Intel and AMD are armor-plating their x64 CPUs with trimmed down GPUs in a single chip package, and Nvidia is plating its GPUs with skinny ARM cores. It will be interesting to see what future workloads require and which approach is best suited for data center, desktop, and handheld needs. The good news is, at least we have a fight coming, and that means the end users win. ®
Obviously they would do their own thing, but it would seem to me to make sense to build a PCI-Express card like Fermi but which wasn't dependent on a host processor because it had its own cores. Then you could have a backplane-based computer like some industrial PCs have had for decades. You could then build a commodity HPC cluster which scaled along the PCI-bus without special hardware. For more than the PCI-E bus could handle you just attach the interconnect of your choice to the PCI-E bus and link multiple backplanes into one giant commodity HPC.
Then start playing Crysis on your 10 CPU-GPU hybrid system.
Sorry, "x64" is already taken
Could you kindly stop referring to x86-64 as x64? It's confusing as hell. Originally x86 was short for the 80x86 series of processors. And x64 would be the abbreviation for Digital/Compaq/HP's 21x64 series of processors—i.e. the Alpha architecture.
It may have been murdered but it hasn't been forgotten.
Rockin' with ARM
It's taken them 25 years, but they finally have US Big Silicon on the run. Good show, chaps!
Who's building ARM with HyperTransport?
@Bob H: The snag with passive backplane stuff (as indeed used in industrial/telecoms boxes for years) is it costs more, which isn't usually welcome in consumer stuff unless it brings a saleable benefit. Reliability and availability are not usually saleable benefits in the consumer market (Windows doesn't have them, does it :)).
On the other hand, if someone were to build an ARM SoC with a built in interface to (say) Hypertransport, and someone (maybe the same someone, maybe not) were to build the necessary high speed PC-style IO etc based around Hypertransport as interconnect... would anyone be interested in that, would it be cost competitive, would it have saleable benefits?
The destruction of a mighty empire
"In the funny way that the world works, Intel, which owns the processor market for PCs and servers, was probably of two minds about the rumors of graphics chip maker Nvidia creating an X64 clone and taking on the chip giant in the CPU space."
I thought the rumour was that Intel used its x86-related patents to lock nVidia out of the market? Certainly I don't think that nVidia would decide to cold-shoulder x86 altogether and instead take the big risk of trying to sell a rival arch to Windows users if it could instead take the easy option and license the x86 patents on acceptable terms. And now it's watching its business of selling GPUs for Intel and AMD PCs dry up even more as Intel pushes Sandy Bridge graphics. So Intel brought this on itself, something it may, or may not, come to regret. Either way it's going to be pretty dramatic - unless Intel relents and lets nVidia into the x86 tent, then either nVidia or the Wintel hegemony is about to have a great fall.