Nvidia CEO outs GPU roadmap

'Perf per watt equals perf'

GTC Nvidia president and CEO Jen-Hsun Huang has outed the company's GPU roadmap.

"For the very first time in the history of our company," Huang said during his Tuesday keynote at the company's GPU Technology Conference in San Jose, California, "we are going to tell you the code names and the progression of our next several generations of processors."

Although Nvidia's senior Tesla product manager Sumit Gupta told The Reg earlier that his company wouldn't be outlining its product roadmap at the conference, he was merely trying not to one-up his boss.

Code names and dates, however, are pretty much all that Huang revealed, aside from relative performance plans. Deep details of the next two generations — "Kepler", due next year; and "Maxwell", due in 2013 — will have to wait until a later date.

Still, what Huang did announce will be of interest to the growing CUDA cadre or, for that matter, to anyone interested in the future of what Huang described as "parallel computing, GPU computing, accelerated computing, heterogeneous computing — however you guys want to describe it."

In describing the performance of Kepler and Maxwell, Huang didn't talk mere gigaflops. Instead, he used the metric of double-precision flops per watt, with Nvidia's Tesla GPU as the baseline. His reasoning boiled down to this: "In a constant power-delivery environment ... 'perf per watt' equals 'perf'."

"In the case of performance, Fermi is about one and a half double-precision gigaflops per watt," Huang said. "We expect Kepler to be somewhere between three to four times the performance per watt of Fermi."

Kepler's performance-per-watt advance is well underway, according to Huang. "Kepler is based on 28 nanometers. It's scheduled for production later next year. The design is progressing very rapidly. There are hundreds of engineers working on it. By the time we are done with the Kepler family, we will probably have invested a couple of billion dollars on R&D for it."
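
For readers who want the arithmetic spelled out, here is a minimal Python sketch of what those figures imply. Only the 1.5 gigaflops-per-watt Fermi number and the "three to four times" multiplier come from Huang's keynote. The function names and the 200-watt power budget are hypothetical, picked purely for illustration, and are not Nvidia specifications.

```python
# Figures quoted by Huang in the keynote.
FERMI_DP_GFLOPS_PER_WATT = 1.5          # double-precision gigaflops per watt
KEPLER_MULTIPLIER_RANGE = (3.0, 4.0)    # "three to four times" Fermi

# ASSUMPTION: an arbitrary fixed power budget, not a figure from the keynote.
HYPOTHETICAL_POWER_BUDGET_WATTS = 200.0


def projected_gflops_per_watt(baseline, multiplier):
    """Scale a baseline efficiency figure by a generational multiplier."""
    return baseline * multiplier


def sustained_gflops(gflops_per_watt, power_budget_watts):
    """With the power budget held constant, throughput is efficiency times watts."""
    return gflops_per_watt * power_budget_watts


# Kepler's implied efficiency range, derived from the quoted Fermi baseline.
kepler_low, kepler_high = (
    projected_gflops_per_watt(FERMI_DP_GFLOPS_PER_WATT, m)
    for m in KEPLER_MULTIPLIER_RANGE
)

# Throughput at the same (hypothetical) power budget for each generation.
fermi_gflops = sustained_gflops(FERMI_DP_GFLOPS_PER_WATT, HYPOTHETICAL_POWER_BUDGET_WATTS)
kepler_low_gflops = sustained_gflops(kepler_low, HYPOTHETICAL_POWER_BUDGET_WATTS)

print(f"Kepler: {kepler_low} to {kepler_high} DP gigaflops per watt")
print(f"Speedup at fixed power: {kepler_low_gflops / fermi_gflops}x (low end)")
```

Because the power budget cancels out of the ratio, the speedup at fixed power is the same as the efficiency multiplier, which is the point of Huang's "'perf per watt' equals 'perf'" line.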

In 2013, "We'll have Maxwell," he said. "Maxwell is going to bring with it yet again a big step up in performance."

If Tesla is regarded as a baseline, Huang said, "Maxwell is going to be 16 times relative to [Tesla] in just a couple more years. In just a few more years, we're going to see a sixteen-times improvement in performance for your parallel-computing applications."

Huang also emphasized that Nvidia's future is, in his view, more than a mere matter of gigaflops per watt. "Now performance isn't the only thing that we will bring. Each generation will bring architectural ideas and features — like we did this time," he said. "This time we brought you ECC so that we can deploy, in a large-scale way, GPUs in server groups."

His reasoning behind the importance of ECC was simplicity itself: "If you need to run a simulation for a week, the last thing that you want to know is that somehow one of the GPUs created the wrong answer along the way."

His company has more non-gigaflopian improvements on the way, Huang said: "So all the way between now and the Maxwell time frame, we're introducing new features that you've been asking for — things like preemption, things like virtual memory. We're also going to continuously enhance the GPU's ability to autonomously process, so that it's non-blocking of the CPU, not waiting for the CPU, relies less on the transfer overheads that we see today."

Huang also promised that Nvidia's roadmap will bring "a very large speedup in performance." Remember, though, that performance in Huang-speak isn't defined by gigaflops alone. "'Perf per watt'," after all, "equals 'perf'." ®
