AMD pins exascale vision on Fusion APUs

The rebirth of vector processing

Because Advanced Micro Devices has not yet announced its 16-core "Interlagos" Opteron 6200 processors, it has to talk about something, and in situations like that, it is best to talk about the far-off future. And so AMD rounded up a bunch of its partners on Wednesday in San Francisco for a shindig to talk about the challenges of exascale computing.

Chuck Moore, CTO in the chip maker's Technology Group, did the talking about exascale, or the desire to create machines that can deliver more than 1,000 petaflops of number-crunching performance. Moore was one of the lead architects of the "Bulldozer" core used in the forthcoming Opteron processors, as well as of the Fusion hybrid CPU-GPU chips, which AMD calls accelerated processing units, or APUs for short.

While Moore is hot on GPUs, he said this is not something new, so much as a return to the past with a twist. "GPU computing is still in its infancy," Moore explained. "Instead of thinking of computing on a GPU, you should be thinking of this as a revival of vector computing. Going forward, we will be developing GPUs that look more like vector computers."

That got a big hallelujah from Peg Williams, senior vice president of HPC Systems at supercomputer maker Cray, a descendant of one of the companies founded by Seymour Cray, the legendary supercomputer designer from Control Data (and from the company that bears his name) and a man who forgot more about vector processors than most experts will ever know.

The issue, said Moore, is not getting to exascale performance, but getting to exascale performance within a 20-megawatt power budget by 2020 or so.

When Moore and his colleagues were thinking about the design of the Bulldozer core, they did some math and figured that delivering somewhere between 1 and 10 petaflops of performance would eat up around 10 megawatts of power, depending on the system architecture, the interconnect, and the scalability of the application software running on the massive cluster.

At the midpoint of the performance range, you are talking about needing 200 megawatts to power up an exascale machine. At $1 per watt per year, which is what these supercomputer labs cost, you are talking about needing $200m a year just to turn the machine on and cool it. So clearly, scaling up high-end x86 CPUs – or any big fat RISC chip – is not the answer.
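The running-cost arithmetic is easy to check. Here is a minimal sketch that uses only the figures quoted in this story ($1 per watt per year, the 200 megawatt estimate, and the 20 megawatt target); the function name is made up for illustration and is not anything AMD presented.

```python
def annual_power_cost_usd(megawatts, usd_per_watt_year=1.0):
    """Yearly cost to power and cool a machine drawing `megawatts`.

    The $1 per watt per year default is the rule of thumb quoted in the
    article, not a measured figure for any particular lab.
    """
    watts = megawatts * 1_000_000  # 1 megawatt = 1,000,000 watts
    return watts * usd_per_watt_year


# The 200 megawatt estimate versus the 20 megawatt target.
for mw in (200, 20):
    cost = annual_power_cost_usd(mw)
    print(f"{mw} MW -> ${cost / 1_000_000:.0f}m per year")
```

On that rule of thumb, hitting the 20 megawatt target would cut the yearly bill by a factor of ten.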

That's why Oak Ridge National Lab's "Titan" machine is a mix of CPUs and Nvidia GPUs embodied in a Cray XK6 chassis. That machine is expected to scale from 10 to 20 petaflops. At 20 petaflops, about 85 per cent of the oomph in the Titan machine will be coming from the GPUs rather than the CPUs, which handle all the serial work and keep the GPUs fed with numbers to crunch.

The AMD plan, says Moore, is to get a 10-teraflops Fusion APU that consumes only 150 watts into the field, and to use this as the basis of an exascale machine. "You start to think that maybe we can get there," said Moore, saying that he would put a stake in the ground and predict an exascale system could be built by 2019 or 2020.
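A back-of-the-envelope check shows why that chip would put the 20 megawatt budget within reach. The sketch below is an illustration only: it assumes peak flops and perfect scaling, and it counts nothing but the APUs themselves (no memory, interconnect, storage, or cooling), none of which Moore quantified. The helper names are hypothetical.

```python
# 1 exaflops = 1,000 petaflops = 1,000,000 teraflops
EXAFLOPS_IN_TERAFLOPS = 1_000_000


def apus_needed(target_teraflops, teraflops_per_apu):
    """Number of APUs needed to reach the target at peak (rounded up)."""
    return -(-target_teraflops // teraflops_per_apu)  # ceiling division


def total_megawatts(apu_count, watts_per_apu):
    """Power drawn by the APUs alone, in megawatts."""
    return apu_count * watts_per_apu / 1_000_000


count = apus_needed(EXAFLOPS_IN_TERAFLOPS, 10)  # 10 teraflops per APU
power = total_megawatts(count, 150)             # 150 watts per APU
print(f"{count:,} APUs drawing {power} MW")
```

That works out to 100,000 APUs drawing 15 megawatts, which leaves 5 megawatts of a 20 megawatt budget for everything else in the machine.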

The issue is not the CPU or the GPU, but rather the memory bandwidth between the two devices, and between those devices and the main memory they will share. This will involve stacking memory in 3D configurations on a chip package with these CPUs and GPUs.

"That is technology that doesn't exist today, but it will be here in time," Moore predicted.

The other trick will be to have a single memory address space for the GPUs and CPUs, but perhaps using different memory technologies to create different segments of main memory that would be more suited to CPUs or GPUs, and let the system try to use those blocks whenever possible.

This idea, like all others, is not new, of course. There are probably many examples, but the one that comes to my mind is the single-level storage architecture of the System/38, AS/400 and Power Systems machines from IBM. They treated cache, main, and disk storage as a single addressable space, meaning that programmers didn’t have to worry about pointers and moving data from disk to memory and back.

It was done automagically by the operating system so RPG programmers could focus on the business logic in their programs instead of worrying about data management. This is precisely the goal that everyone has for future supercomputer applications that span multiple computing architectures.

The use of Fusion APUs in supercomputers got its start today. Penguin Computing, an AMD reseller, announced that it has sold a 59.6 teraflops system to Sandia National Labs, one of the big US Department of Energy compute facilities. The 104-node system is based on AMD's A8-3850 APU and is plunked into the Altus 2A00 rack server. And yes, it can play Crysis. ®
