HPC geeks ponder 100 petafloppers and quantum supercomputers

Crowdsourcing prognostications for fun and profit prophet

What the Tianhe-2 super should look like in its final home

ISC 2013 The next big barrier for supercomputing is punching through 100 petaflops peak performance, which frankly could be done in a heartbeat if someone had a few hundred million dollars lying around. And now that Google and NASA are monkeying around with a quantum computer, thoughts are turning to how a QC might be deployed to replace some of the work done by traditional supercomputer clusters.

To get a sense of what the HPC community thinks about these two barriers, the organizers of the International Supercomputing Conference in Leipzig, Germany, last week put up a straw poll. Attendees were asked to mark on a chart when they thought the first 100 petaflops machine would appear on the Top500 rankings of supercomputers, and then when the first quantum machine would appear on the list.

With China's Tianhe-2 supercomputer having just punched through 50 petaflops of peak theoretical performance, and considerably earlier than anyone expected, thoughts are naturally turning to when a machine will surpass 100 petaflops of aggregate double-precision number-crunching oomph.

The original plan for Tianhe-2 called for the ceepie-geepie to hit 100 petaflops by 2015, and the rumours going around at ISC last week were that this is still China's plan.

With all the cash that China has, it could do whatever it wants schedule-wise for building a 100-petaflopper, which might cost as much as half a billion dollars or more using future processor and GPU or x86 coprocessor technology. You would want future "Haswell" Xeon E5 v3 processors and either "Maxwell" GPU coprocessors from Nvidia or "Knights Landing" x86 coprocessors from Intel in such a machine.

Maxwell GPUs will launch maybe in late 2013 or early 2014, based on a roadmap shared back in March by Nvidia, and will deliver about twice the flops per watt of the current "Kepler" K20 and K20X coprocessors. So you could get a 100 petaflops machine in the same space as Tianhe-2 and burn about the same juice.

The consensus at ISC seems to be that we will see a 100 petaflops machine in the next two years

It might be tempting to wait until the "Volta" GPU coprocessor kickers to the Maxwells come out. These will have stacked memory on the cards as well as the unified virtual memory between the CPU and GPU that is coming with the Maxwells, and with Voltas offering twice as much double-precision performance per watt as the Maxwells--around four times the current Keplers--you could easily get 200 petaflops in a machine the same size as Tianhe-2.
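The back-of-the-envelope scaling behind those 100 and 200 petaflops figures can be sketched in a few lines of Python. This is a rough illustration, not a vendor projection: it assumes peak flops at a fixed space and power budget scale directly with coprocessor flops per watt, and it treats a 50 petaflops Tianhe-2 class machine as the Kepler-era baseline, as the reasoning above implicitly does. The function and constant names are made up for the example.

```python
# Rough scaling sketch, not a vendor projection.
# Assumption: at a fixed power budget, peak flops scale directly with the
# coprocessor's flops per watt (CPUs, memory and interconnect are ignored).

TIANHE2_PEAK_PF = 50.0  # the "50 petaflops" baseline cited in the article

# Relative double-precision flops per watt, with Kepler as 1.0, following
# the article's reading of Nvidia's roadmap.
EFFICIENCY_VS_KEPLER = {"Kepler": 1.0, "Maxwell": 2.0, "Volta": 4.0}


def projected_peak_pf(generation: str, baseline_pf: float = TIANHE2_PEAK_PF) -> float:
    """Peak petaflops in the same power envelope as the baseline machine."""
    return baseline_pf * EFFICIENCY_VS_KEPLER[generation]


if __name__ == "__main__":
    for gen in EFFICIENCY_VS_KEPLER:
        print(f"{gen}: ~{projected_peak_pf(gen):.0f} petaflops")
```

Under those assumptions the Maxwell row lands at 100 petaflops and the Volta row at 200 petaflops, which is all the "same size, same juice" argument amounts to.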

If the roadmap Nvidia is showing is drawn to scale, then Volta GPUs are not expected until 2016. But to land a big deal, and provided it could do so, Nvidia could give the Chinese government early access to them, just as Intel did with the "Ivy Bridge-EP" Xeon E5 v2 processors in the Tianhe-2 machine. These are not expected to launch until the late summer or early fall, and yet China has 16,000 of them already.

If Nvidia wants to win deals in China and in the big supercomputing centers in the United States, Japan, and Europe, getting Maxwell and Volta GPUs out sooner rather than later is clearly better. Intel is working on its "Knights Landing" kickers to the Xeon Phi x86 coprocessors, which will be implemented in 14 nanometer technology (allowing either more cores or faster cores or a mix of both) and will have integrated on-package memory. The expectation is for Knights Landing to debut in the next 18 to 24 months, which is plenty of time to make it to a Top500 list in 2015.

The Chinese government is not the only one trying to push up above 100 petaflops. The general plan at Oak Ridge National Laboratory two years ago was to get a machine, dubbed OLCF-4, into the field with between 100 and 250 petaflops based on DARPA HPCS technology.

That means the Cray "Cascade" XC30 machine with the "Aries" interconnect, and the plan was for that to be done after 2015. Oak Ridge likes to get a new machine every three years, and the "Titan" ceepie-geepie built by Cray was just fired up last fall. So expect a shiny new Cray Cascade at Oak Ridge in late 2015, probably with an upgraded Aries interconnect, probably with "Broadwell" Xeon E5s, and with Maxwell GPU or Intel Knights Landing x86 coprocessors doing most of the heavy lifting on calculations.

The other 100 petaflopper on the horizon is the "NERSC-8" kicker to the "Edison" 2 petaflops machine that was just fired up this past weekend at the National Energy Research Scientific Computing Center, which is located at Lawrence Berkeley National Lab. The first phase of the Edison machine, which is a Cray XC30 box with 664 compute nodes, was based on the current eight-core "Sandy Bridge-EP" Xeon E5 v1 processors, and the fact that the phase two construction page doesn't say what processors are being used is a very strong indication that it will be the same twelve-core Xeon E5 v2 chips that Tianhe-2 uses.

Anyway, the node count for the full Edison machine is 5,200 boxes, with an order of magnitude boost in core count, and if you do the math, the twelve-core Ivy Bridge has to be going in, and it will have a peak performance of roughly 2 petaflops.
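For anyone who wants to do that math, here is a minimal sketch of the core-count comparison. The node counts and cores per chip come from the text above; the two-sockets-per-node figure is an assumption made for this example, and the helper names are hypothetical.

```python
# Core-count check for the Edison upgrade.
# Assumption (not stated in the article): two CPU sockets per node
# in both phases of the machine.

SOCKETS_PER_NODE = 2

PHASE_ONE_NODES = 664   # first phase, eight-core Sandy Bridge-EP chips
FULL_NODES = 5200       # full machine


def total_cores(nodes: int, cores_per_socket: int,
                sockets_per_node: int = SOCKETS_PER_NODE) -> int:
    """Total CPU cores across all nodes."""
    return nodes * sockets_per_node * cores_per_socket


def core_boost(cores_per_socket: int) -> float:
    """Core-count ratio of the full machine over phase one."""
    return total_cores(FULL_NODES, cores_per_socket) / total_cores(PHASE_ONE_NODES, 8)


if __name__ == "__main__":
    for cores in (8, 12):
        print(f"{cores}-core chips: {total_cores(FULL_NODES, cores):,} cores, "
              f"{core_boost(cores):.1f}x phase one")
```

With eight-core chips the full machine would have a bit under eight times the cores of phase one; with twelve-core chips the ratio comes out at just under twelve. Only the twelve-core option clears the "order of magnitude" bar.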

Berkeley Lab is aiming for 100 petaflops with its NERSC-8 box, which three years ago was tentatively scheduled for 2017 or so with a strong possibility of an earlier delivery. In the latest NERSC roadmap embedded in the RFP from last November, that schedule was moved up to late 2015 or early 2016. Vendors are expected to be selected in the third or fourth quarter of this year, and it is highly likely it will be a Cray Cascade with future Intel processors and future Nvidia or Intel coprocessors.

And, as it turns out, the RFP for NERSC-8 will also include a second system called "Trinity" that is expected to be installed at Los Alamos National Laboratory to help manage the US government's stockpile of nuclear weapons. The exact floppage of Trinity has not been divulged, but this is a winner-take-all deal and it looks like Trinity will be about twice the size of NERSC-8. So that should be in the range of 200 petaflops.

With China stepping up the pace, there will be strong political pressure for the Obama Administration to come up with the funds for these three massive machines at Oak Ridge, Berkeley Lab, and Los Alamos, and you can bet Cray is happy to think about those prospects, seeing as it is on the inside track. Maybe IBM has plans to make more Power-based supercomputers, but Big Blue has been pretty quiet about any plans for BlueGene/Q successors. The company could put a 100 petaflopper on the floor today if someone could pay for it.

Now, for some fun, or silliness. When did the ISC attendees think a quantum computer would appear on the Top500 list?

The jury is still out on when a quantum super might materialize

The consensus seems to be anywhere from after 2020 to, um, never. And maybe "never" is the better answer for a few reasons. Maybe a quantum computer is not a supercomputer in the traditional sense, even if it can do certain kinds of probabilistic calculations at blazing speeds. And as El Reg pointed out, the D-Wave quantum system being installed by Google and NASA for $15m is, strictly speaking, a quantum-ish computer, not properly quantum, that is good at pattern-matching and other machine-learning jobs as well as solving equations with blazing speed if they map well to the quantum hardware.

All this server hack knows is he needs to do a lot more reading up on this quantum computing gobbledygook, and I was clearly not alone in the ISC crowd in that regard. ®
