Nvidia forges ARM chip for PCs and servers

Rumors true – except for the x64 bit

CES 2011 The ARM race for the data center just got a whole lot more interesting. At the Consumer Electronics Show in Las Vegas, Nvidia announced that it is indeed working on a CPU and that the chip is based on the ARM RISC architecture that's wickedly popular in smartphones and tablets. In other words, it's not a low-powered x64 chip.

In the funny way that the world works, Intel, which owns the processor market for PCs and servers, was probably of two minds about the rumors of graphics chip maker Nvidia creating an x64 clone and taking on the chip giant in the CPU space.

Intel didn't need the competition, with Advanced Micro Devices already in there, but another x64 competitor is easy enough for Intel to defend against. With Microsoft also announcing today at CES that Windows will run on ARM chips and with Google's Android Linux, Apple's iOS, and Canonical's Ubuntu already on the low-powered processors, that's a big portion of modern computing.

Throw in Red Hat Enterprise Linux and its clones from CentOS and Oracle, and you can pretty much call it a day in terms of OS coverage for ARM. Alas, today was not Intel's lucky day, no matter how much revenue and profit it stands to get in the near term from its new "Sandy Bridge" Core PC chips and future Xeon variants for workstations and servers. With Nvidia and Microsoft jumping into the ARM race, the future for Intel is going to get a whole lot tougher.

After going on and on for most of an hour about "super phones" based on Nvidia's Tegra 2 system-on-a-chip, which has two ARM Cortex-A9 cores and which combines processing capability with HD graphics, Jen-Hsun Huang, president and chief executive officer at Nvidia, announced in his keynote address that Nvidia was indeed working on a processor that would eventually be integrated with its graphics co-processors. The ARM-happy effort is known as "Project Denver".

Nvidia has also licensed the future Cortex-A15 design from ARM Holdings, the company behind the ARM design, joining Qualcomm, Texas Instruments, Calxeda, and many others who are also goosing the ARM chip to grow it from its mobile phone niche to more general purpose computing. All of these companies believe that the technical ecosystem and user base for smartphones is larger and that the ARM architecture is more energy efficient than the x64 architecture. However, the ARM architecture has its own issues, with 32-bit memory addressing and a future 40-bit extended memory addressing still in the works. Many believe that to be useful in the server space, ARM chips need to get to true 64-bit addressing. We'll see.
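
To put those address widths in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a flat, byte-addressed memory space, which is a simplification of how any real chip maps memory, and the function names `addressable_bytes` and `human` are made up for illustration rather than taken from any ARM or Nvidia material.

```python
def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with a flat, byte-addressed space of the given width (an assumption)."""
    return 2 ** address_bits


def human(n: int) -> str:
    """Format a byte count using binary (1024-based) units, rounding down."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    i = 0
    while n >= 1024 and i < len(units) - 1:
        n //= 1024
        i += 1
    return f"{n} {units[i]}"


# The three widths mentioned in the article: 32-bit today,
# 40-bit extended addressing in the works, and "true" 64-bit.
for bits in (32, 40, 64):
    print(f"{bits}-bit addressing: {human(addressable_bytes(bits))}")
```

Under those assumptions, 32 bits tops out at 4 GiB, 40 bits stretches that to 1 TiB, and 64 bits reaches 16 EiB, which is why server buyers tend to care about the difference.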

"The energy around the ARM architecture is absolutely enormous," said Huang in his keynote. "There is no question now that in two or three years time - overnight practically in the technology business - that ARM is the new ISA, the new standard."

Then Huang flashed up this chart to make his point:

ARM versus X64 shipments

Huang said that Project Denver was one of the most important announcements that Nvidia has made in its history, and he said that a team of engineers with expertise with ARM, Sparc, x64, Power, and MIPS chips has been hard at work for years on figuring out what future smartphones, tablets, PCs, and servers might need in terms of CPUs and GPUs. "I think this is a game changer," Huang said.

Nvidia is not providing much in the way of detail about Project Denver, but Andy Keane, general manager of Tesla supercomputing at Nvidia, told El Reg that Nvidia was slated to deliver its Denver cores concurrent with the "Maxwell" series of GPUs, which are due in 2013. As we previously reported, Nvidia's "Kepler" family of GPUs, implemented in 28 nanometer processes, is due this year, delivering somewhere between three and four times the gigaflops per watt of the current "Fermi" generation of GPUs. The Maxwell GPUs are expected to offer about 16 times the gigaflops per watt of the Fermi. (Nvidia has not said what wafer baking process will be used for the Maxwells, but everyone is guessing either 22 or 20 nanometers.)

While Keane would not say how many ARM cores would be bundled on the Maxwell GPUs, he did confirm that Nvidia would be putting a multicore chip on the GPUs and hinted that it would be considerably more than the two cores used on the Tegra 2 SoCs. "We are going to choose the number of cores that are right for the application," says Keane.

One thing that Nvidia is not going to do is create a standalone CPU that fits in a processor socket and gets sold on a cheap server. The Nvidia plan is to bake a CPU-GPU hybrid and sell it into server and PC markets where the combination of serial/parallel and massively parallel processing makes sense. Nvidia is not just shooting for the supercomputing and data analytics markets; it also believes that future Web infrastructure and general-purpose computing will be able to make full use of the ceepie-geepie hybrid chips.

Keane was also mum on what kind of memory addressing the Denver cores would have. "It is a logical question, and we will do the logical thing," he said with a laugh.

Bill Dally, Nvidia's chief scientist and vice president of research and formerly the chairman of the computer science department at Stanford University, took jabs at the x64 architecture in a blog about the announcement of the Denver ceepie-geepie effort.

"Denver frees PCs, workstations and servers from the hegemony and inefficiency of the x86 architecture," Dally wrote. "For several years, makers of high-end computing platforms have had no choice about instruction-set architecture. The only option was the x86 instruction set with variable-length instructions, a small register set, and other features that interfered with modern compiler optimizations, required a larger area for instruction decoding, and substantially reduced energy efficiency.

"Denver provides a choice. System builders can now choose a high-performance processor based on a RISC instruction set with modern features such as fixed-width instructions, predication, and a large general register file. These features enable advanced compiler techniques and simplify implementation, ultimately leading to higher performance and a more energy-efficient processor."

So now the lines are drawn on the battlefield. Intel and AMD are armor-plating their x64 CPUs with trimmed down GPUs in a single chip package, and Nvidia is plating its GPUs with skinny ARM cores. It will be interesting to see what future workloads require and which approach is best suited for data center, desktop, and handheld needs. The good news is, at least we have a fight coming, and that means the end users win. ®
