China takes HPC heavyweight title

GPUs, Arch interconnect knock out Jaguar and Roadrunner

If it wasn't immediately obvious that China is a superpower, today's announcement that the Tianhe-1A CPU-GPU hybrid is the most powerful supercomputer in the world - and by a comfortable margin - will make it abundantly clear.

China wants to move from being a manufacturing powerhouse to being a full player in the 21st century technological economy, and it is making the investments to transform itself.

The National Supercomputer Center in Tianjin, China, this morning rolled out the Top 100 rankings of the country's fastest supercomputers (based on the Linpack Fortran benchmark test, like the global Top 500 list). The Tianhe-1A (the name translates from Chinese as "River in the Sky" or "Milky Way", with a model number slapped on it) beat out all of its rivals. The supercomputer is based on a rack server design created by the National University of Defense Technology (NUDT), and comprises 14,336 Xeon processors and 7,168 of Nvidia's Tesla M2050 fanless GPU co-processors.

The resulting machine has a peak theoretical performance of 4.7 petaflops, which is a gargantuan amount of raw performance, but where the rubber hits the road on the Linpack test, the machine delivers 2.51 petaflops.

That means 47 per cent of the theoretical performance of the machine is going up the chimney. This is not particularly good. But with CPU-GPU clusters costing roughly a quarter as much as CPU-only clusters on a teraflops-for-teraflops basis, according to Sumit Gupta, product marketing manager for the Tesla product line, the inefficiency can be tolerated for the sake of scalability. For now, at least.
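For readers who want to check the chimney arithmetic, here is a minimal sketch in Python. It uses only the two figures quoted above (4.7 petaflops peak, 2.51 petaflops on Linpack); the function and constant names are made up for illustration.

```python
def linpack_efficiency(sustained_pflops: float, peak_pflops: float) -> float:
    """Return the fraction of theoretical peak actually delivered on Linpack."""
    if peak_pflops <= 0:
        raise ValueError("peak performance must be positive")
    return sustained_pflops / peak_pflops


# Figures quoted in the article for Tianhe-1A, in petaflops.
TIANHE_1A_PEAK = 4.7
TIANHE_1A_SUSTAINED = 2.51

efficiency = linpack_efficiency(TIANHE_1A_SUSTAINED, TIANHE_1A_PEAK)
lost = 1.0 - efficiency  # the share "going up the chimney"

print(f"delivered: {efficiency:.1%}, lost: {lost:.1%}")
# prints "delivered: 53.4%, lost: 46.6%"
```

Rounded to the nearest whole number, the lost share is the 47 per cent cited above.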

Coders and hardware engineers the world over will now be trying to boost efficiencies on the PCI-Express bus, on the system interconnects, and in the software stack to get the sustained performance a lot closer to the peak for ceepie-geepie hybrid machines. Gupta says that the GPUs are responsible for around 70 per cent of the calculations that were done on the Linpack test.

Like the USS Enterprise, the Tianhe-1A, as the name suggests, is not the first hybrid parallel super that China has put into the field. The Tianhe-1 cluster, based on Intel Xeon chips and Advanced Micro Devices Radeon HD 4870 GPUs, broke onto the Top 500 list in November 2009. That machine had 71,680 cores and had a peak theoretical performance of 1.2 petaflops and a sustained performance of 563.1 teraflops. In that case, 53 per cent of the aggregate performance went up the chimney.
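To put the two generations side by side, the sketch below compares the figures quoted in this article for Tianhe-1 and Tianhe-1A. The class and variable names are hypothetical, and because the Tianhe-1 peak is given only as a rounded 1.2 petaflops, the results should be read as approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinpackResult:
    """Peak and sustained Linpack figures for one machine, in teraflops."""
    name: str
    peak_tflops: float
    sustained_tflops: float

    @property
    def efficiency(self) -> float:
        # Share of theoretical peak delivered on the benchmark.
        return self.sustained_tflops / self.peak_tflops


# Figures as quoted in the article, converted to teraflops.
tianhe_1 = LinpackResult("Tianhe-1", peak_tflops=1200.0, sustained_tflops=563.1)
tianhe_1a = LinpackResult("Tianhe-1A", peak_tflops=4700.0, sustained_tflops=2510.0)

# How much faster the new machine runs Linpack than the old one.
speedup = tianhe_1a.sustained_tflops / tianhe_1.sustained_tflops

# Change in efficiency, in percentage points.
gain_points = (tianhe_1a.efficiency - tianhe_1.efficiency) * 100

for machine in (tianhe_1, tianhe_1a):
    print(f"{machine.name}: {machine.efficiency:.0%} of peak delivered")
print(f"sustained speedup: {speedup:.1f}x, efficiency gain: {gain_points:.1f} points")
```

On these numbers, sustained Linpack performance rose by a factor of roughly 4.5 between the two machines, while the share of peak delivered improved by about six and a half percentage points.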

The Tianhe-1A CPU-GPU hybrid super

The Tianhe-1A super is important not just because it is now the fastest supercomputer in the world, but because NUDT has spent years developing its own proprietary interconnect for the server nodes. And as El Reg previously reported, a future generation of Tianhe machines will use a homegrown multi-core processor, called Godson and based on the MIPS core. (So when does China's Institute of Computing Technology, part of the Chinese Academy of Sciences, start making its own GPUs?)

According to sources at Nvidia, which had people on the floor at the unveiling of Tianhe-1A in China this morning, the proprietary interconnect is called Arch and it links the server nodes together using optical-electric cables in a hybrid fat tree configuration. The switch at the heart of Arch has a bi-directional bandwidth of 160 Gb/sec, a latency for a node hop of 1.57 microseconds, and an aggregate bandwidth of more than 61 Tb/sec.

Some people have been suggesting that this interconnect somehow links the GPUs to the CPUs, but I am fairly certain that the GPUs hook to the CPUs by the plain old PCI-Express 2.0 bus in the server nodes. It would be very interesting if this interconnect had something akin to Remote Direct Memory Access, allowing a node to reach over the PCI-Express bus directly into the memory of a GPU in another node. Nvidia didn't mention this, and no one else has either, but such a feature in the Arch switch could significantly speed up performance.

The Tianhe-1A super has an aggregate of 262 TB of main memory and 2 PB of storage implemented as a Lustre clustered file system. The machine comprises 112 compute racks, eight storage node cabinets, six communications racks, and 14 I/O racks.

I personally welcome our Chinese HPC overlords. It's hard not to when my government owes their government $2 trillion, right? ®
