China to build ANOTHER 100 petaflops hybrid supercomputer by 2014?

Intel Inside the Middle Kingdom

We passed through the 10 petaflops barrier in the supercomputer racket last year, and the next station on the train to exaflops is 100 petaflops.

China already admitted at last year's International Super Computing '12 shindig that it was working on a kicker to the Tianhe-1A hybrid CPU-GPU supercomputer, with the goal of having the Tianhe-2 machine reach 100 petaflops of peak performance by 2015. And now the Chinese government, which is flush with trillions of dollars in cash, could be moving the schedule forward by as much as a year – and perhaps with a totally different machine.

A report in Singapore-based VR-Zone by Theo Valich claims inside information that a 100 petaflopper is being commissioned by the Chinese Ministry of Science for use in space exploration and healthcare research. According to the report, it will consist of around 100,000 of Intel's "Ivy Bridge-EP" Xeon E5-2600 v2 processors and the same number of the next-generation "Knights Landing" Xeon Phi multicore x86 coprocessors.

Nvidia helped build the Tianhe-1A ceepie-geepie hybrid machine back in 2010, based on a rack server design created by the National University of Defense Technology (NUDT). The machine had 86,016 Xeon cores on the CPU side and 100,352 Tesla M2050 cores on the GPU side for a total of 7,168 CPUs and GPUs. This machine delivered a peak theoretical performance of 4.7 petaflops and a sustained 2.57 petaflops on the Linpack Fortran benchmark.

Tianhe-1A uses a proprietary interconnect, and is the machine that put Chinese HPC on the petaflops map, even though many thought at the time that this initial box was just a publicity stunt to put a CPU-GPU box that did not come from an American or European institution at the top of the list. China has also created its own homegrown variants of Sparc (which are in the Tianhe-1A cluster) and MIPS processors called Godson 3B that are aimed at everything from mobiles to supers.

Nvidia refused to talk about this prospective 100 petaflops machine or the Tianhe-2 box, if they are indeed different machines at all. Intel would not talk about this Chinese 100 petaflops box, either. "Intel does not comment on rumors and speculations," was what an Intel spokesperson told El Reg when we brought up the rumors about this Chinese machine and the further rumor that the processor and coprocessor technology would only cost $100m combined.

This is ridiculously cheap, and is almost certainly something that has gotten lost in translation somewhere. Either that, or Intel is making the chips in a Chinese fab and is getting all kinds of breaks from the Beijing government. Or Intel is using a supercomputer as a loss leader for some other effort in China, like convincing cell phone and microserver makers to use Atoms instead of ARMs.

Assuming that the Ivy Bridge-EP processors shift to ten cores, up from eight with the "Sandy Bridge-EP" Xeon E5-2600s, you would need a mere 5,000 two-socket server nodes to get to 100,000 cores. These nodes would not contribute much in the way of performance of the system. If the clock speeds of the two Xeon E5 families are the same, then these 5,000 nodes would be on the order of 2.1 petaflops, tops. Now, the story in VR-Zone says "approximately 100,000 Ivy Bridge-EP based Xeon E5s," which is a staggering 50,000 server nodes and a mind-blowing 1 million Xeon cores for a total of 21 petaflops of aggregate performance. That leaves another 80 petaflops or so that you need to get from the Xeon Phi coprocessors.
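For those who want to check the Xeon arithmetic, here is a rough sketch in Python. The article does not state a clock speed or a flops-per-clock rate, so the 2.6GHz clock and the 8 double-precision flops per core per clock (the AVX rate on these Xeons) are assumptions chosen for illustration, as are all the names in the code.

```python
# Assumptions (not from the VR-Zone report): 10 cores per socket,
# a 2.6 GHz clock, and 8 double-precision flops per core per clock (AVX).
CORES_PER_SOCKET = 10
SOCKETS_PER_NODE = 2
CLOCK_GHZ = 2.6
FLOPS_PER_CLOCK = 8


def xeon_peak_petaflops(sockets):
    """Peak double-precision petaflops for a given number of Xeon sockets."""
    cores = sockets * CORES_PER_SOCKET
    gigaflops = cores * CLOCK_GHZ * FLOPS_PER_CLOCK
    return gigaflops / 1e6  # 1 petaflops = 1e6 gigaflops


# 10,000 sockets is the "100,000 cores" reading; 100,000 sockets is
# the "100,000 processors" reading from the VR-Zone story.
for sockets in (10_000, 100_000):
    nodes = sockets // SOCKETS_PER_NODE
    cores = sockets * CORES_PER_SOCKET
    print(f"{sockets:,} sockets = {nodes:,} nodes, {cores:,} cores, "
          f"{xeon_peak_petaflops(sockets):.1f} petaflops peak")
```

Under those assumptions the two readings land at roughly 2.1 and 21 petaflops, in line with the figures above. A different clock speed scales both results linearly.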

Let's do some math. The current "Knights Corner" Xeon Phi coprocessors have 60 active cores (on a die with 64 cores) running at 1.05GHz delivering just over 1 teraflops of oomph. To get 80 petaflops of peak number-crunching oomph, you would need 75,973 Xeon Phi cards, which would work out to around 1.5 Xeon Phi cards per node. Call it two Xeon Phis in the Knights Corner generation per node just for fun, and that alone gives you 105.3 petaflops with the current generation.
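The card count can be reproduced with a few lines of Python. This is only a sketch: the 1.053 teraflops per card is an assumption, picked because it is the per-card peak that the 75,973 and 105.3 petaflops figures imply, and the variable names are made up for the example.

```python
# Assumption: 1.053 teraflops peak per Knights Corner card, the value
# implied by the 75,973-card and 105.3-petaflops figures in the text.
PHI_TERAFLOPS = 1.053
TARGET_PETAFLOPS = 80
NODES = 50_000

# Cards needed to cover the 80-petaflops gap (fractional, before rounding).
cards_exact = TARGET_PETAFLOPS * 1000 / PHI_TERAFLOPS
cards_per_node = cards_exact / NODES

# Peak if every node simply gets two cards instead.
two_per_node_petaflops = 2 * NODES * PHI_TERAFLOPS / 1000

print(f"cards for {TARGET_PETAFLOPS} petaflops: {cards_exact:,.1f}")
print(f"cards per node: {cards_per_node:.2f}")
print(f"two cards per node: {two_per_node_petaflops:.1f} petaflops")
```

The exact quotient is a shade over 75,973, so strictly speaking one more card is needed to clear 80 petaflops, which makes no difference to the conclusion.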

Now, you know Intel won't sit still, and it will likely add more cores to the Knights Landing Xeon Phi. The Knights Corner chips are already etched in 22 nanometer processes, so Knights Landing has to either use the same process and have architectural improvements or move to a new process and do a shrink and a core count boost.

We think it will be the former rather than the latter, and so let's be optimistic and say that Intel can goose the performance of the Xeon Phis by as much as 25 to 30 per cent without moving to 14 nanometers. So with 100,000 Xeon Phi v2 coprocessors, you'd be at somewhere around 135 petaflops on the Xeon Phis and another 21 petaflops on the server nodes. Now you are pushing up to 156 petaflops peak.
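As a sanity check on that estimate, the sketch below applies the guessed 25 to 30 per cent uplift to the 105.3 petaflops worked out for 100,000 Knights Corner cards, then adds the 21 petaflops from the Xeon nodes. The constants are the estimates from the text, not vendor numbers, and the function name is hypothetical.

```python
# Figures are the estimates from the text, not confirmed specifications.
PHI_V1_PETAFLOPS = 105.3   # 100,000 Knights Corner cards
XEON_PETAFLOPS = 21.0      # 50,000 two-socket Xeon nodes
UPLIFTS = (0.25, 0.30)     # guessed per-card gain for the next generation


def system_peak(uplift):
    """Return (Xeon Phi petaflops, total petaflops) for a per-card uplift."""
    phi = PHI_V1_PETAFLOPS * (1 + uplift)
    return phi, phi + XEON_PETAFLOPS


for uplift in UPLIFTS:
    phi, total = system_peak(uplift)
    print(f"{uplift:.0%} uplift: {phi:.1f} PF on Xeon Phi, {total:.1f} PF total")
```

The range comes out at about 131.6 to 136.9 petaflops on the coprocessors and 152.6 to 157.9 petaflops overall, so the 135 and 156 petaflops figures sit comfortably inside it.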

It is possible that China is working on such a machine, but it is hard to imagine that it will cost as little as $100m for the processing elements. If you bought 100,000 Xeon Phi v1 coprocessors at list price based on 1,000-unit trays, you'd pay $265m, and 50,000 server nodes would run you maybe another $300m depending on the memory and networking if you bought them as onesies online.

Assuming China is using its own proprietary interconnect, you might be able to get the base servers at $250m list. Call it a cool $515m at list for both the servers and the Xeon Phis, and maybe with a 45 per cent discount and some rounding, you could get it down to $285m.
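The pricing guesswork boils down to one multiplication, sketched here in Python. All three inputs are the rough list-price estimates and the assumed discount from the text, so treat the result as back-of-the-envelope only.

```python
# Rough estimates from the text, in millions of US dollars.
PHI_LIST_MILLIONS = 265      # 100,000 Xeon Phi cards at list
SERVER_LIST_MILLIONS = 250   # 50,000 base server nodes at list
DISCOUNT = 0.45              # assumed volume discount

list_total = PHI_LIST_MILLIONS + SERVER_LIST_MILLIONS
discounted_total = list_total * (1 - DISCOUNT)

print(f"list price: ${list_total}m")
print(f"after {DISCOUNT:.0%} discount: ${discounted_total:.2f}m")
```

That gives $515m at list and a little over $283m after the discount, which rounds to the $285m quoted above and is still nearly three times the rumored $100m.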

Whatever China is doing, it will make it known in its own good time. ®
