Japanese nuke lab erects 200 teraflop super

Heads for 'Venus'

Server maker Fujitsu has announced that the Japan Atomic Energy Agency will be building a 200 teraflops cluster based on Intel's 'Nehalem EP' Xeon 5500 processors and Fujitsu's blade form factor. JAEA is also buying two Sparc-based clusters, foundations for even larger petaflops-scale supers that Fujitsu plans to build using its future 'Venus' eight-core Sparc64-VIII processors.

Today, JAEA relies on two clusters of a more modest variety - one offering 13.1 teraflops of performance, the other 2.4 teraflops. Considering how important nuclear power is to Japan and the amount of computing capacity that the United States, the United Kingdom, and France use when doing nuclear research, these are relatively puny clusters. But they're not as puny as you might think. Whereas a lot of the nuclear research that the Western nations do involves weapons, Japan just does research on fission and fusion reactors and how to best handle nuclear fuel and waste.

But JAEA is still looking for more power, and it has aspirations of reaching the petaflops level to advance its research for fission and fusion reactors.

The biggest machine that JAEA currently has is an Altix 3700 Bx2 shared memory system from Silicon Graphics. This box uses 2,048 single-core 1.6 GHz Itanium 2 processors and has a mere 13 TB of memory matched up against its 13.1 teraflops of number-crunching power (peak, not sustained).

It is now four years old and looking long in the tooth. In June 2005, when it was installed, the machine ranked 15th on the Top 500 supers list, but it fell off the list in November 2008. The agency also has a 2.4 teraflops cluster of unknown technology that is used specifically to simulate its fast breeder reactor.

But this iron will soon be history. The agency has contracted with Fujitsu to build a parallel Linux super based on its new Primergy BX900 Dynamic Cube blade servers, which were announced in early May. The plan calls for JAEA to install 2,157 blades using the quad-core 2.93 GHz X5570 processors (the fastest 95 watt versions of the Nehalem EPs), for a total of 17,256 cores. The nodes will be linked together using quad data rate (40 Gb/sec) InfiniBand switches, and the resulting cluster will have a peak theoretical performance of 200 teraflops.
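The peak figures quoted here can be checked with back-of-envelope arithmetic: cores times clock speed times floating point operations per clock cycle. The sketch below is only an illustration. It assumes four double-precision flops per cycle per core for both the Nehalem EP Xeons and the Itanium 2 chips in the Altix, and two sockets per blade (implied by 17,256 cores across 2,157 quad-core blades); the function and variable names are made up for this example.

```python
def peak_teraflops(cores: int, clock_ghz: float, flops_per_cycle: int = 4) -> float:
    """Theoretical peak in teraflops: cores x clock (GHz) x flops per cycle."""
    # cores * GHz * flops/cycle gives gigaflops; divide by 1000 for teraflops.
    return cores * clock_ghz * flops_per_cycle / 1000.0


# New Primergy BX900 cluster, using the figures quoted in the article.
blades = 2157
sockets_per_blade = 2   # assumption: implied by 17,256 cores / 2,157 blades / 4 cores
cores_per_socket = 4    # quad-core X5570
xeon_cores = blades * sockets_per_blade * cores_per_socket
xeon_peak = peak_teraflops(xeon_cores, 2.93)

# Existing Altix 3700 Bx2: 2,048 single-core 1.6 GHz Itanium 2 processors.
altix_peak = peak_teraflops(2048, 1.6)

print(f"Xeon cluster: {xeon_cores} cores, {xeon_peak:.1f} teraflops peak")
print(f"Altix 3700 Bx2: {altix_peak:.1f} teraflops peak")
```

Under those assumptions the numbers land at roughly 202 teraflops for the new cluster and 13.1 teraflops for the Altix, in line with the quoted peak ratings. Sustained performance on real code will be lower.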

This machine - which will be operational in March 2010 - will be used to do nuclear fusion simulations, and JAEA estimates that its simulation code will require a minimum of 100 teraflops to run. JAEA might be installing a Linux-x64 cluster now, but it looks like it's making some bets on future Fujitsu supercomputer nodes and shared memory systems.

JAEA is also installing a Sparc Enterprise M9000 machine, rated at 1.92 teraflops and using the current quad-core Sparc64-VII processors, as a big memory box. It is adding a network of FX1 single-socket Sparc64-VII server nodes as the test bed for a future petaflops-scale super that JAEA plans to install. That development cluster will have 320 nodes and 1,280 cores (four cores per node) and is rated at 12 teraflops of peak performance. The Sparc machines presumably will run Solaris.

All of the machines are managed using Fujitsu's own Parallelnavi cluster and job management software, and they share access to 1.2 petabytes of Fujitsu's Eternus DX80 disk arrays.

When it is operational, the 200 teraflops Xeon-Linux super will be the most powerful machine in Japan. But JAEA is implying that it is going to reach for petaflops, and to do so, it will be using the Sparc architecture, not x64 chips.

As El Reg reported back in May, Japanese server makers NEC and Hitachi have both pulled out of the $1.2bn Next Generation Supercomputer Project, which is being sponsored by the Japanese government. The project was meant to create a hybrid scalar and vector supercomputer, with NEC and Hitachi building the vector half and Fujitsu building the scalar half on its eight-core Venus Sparc64-VIII chips. NEC did all the design work for the vector half of this Project Keisoku machine - which was intended to scale to 10 petaflops of peak performance. But in May, after reporting an $8bn loss for its fiscal 2009 year ended in March, NEC said that it could not actually manufacture the vector half of the Keisoku system without incurring losses, and it walked away from the deal along with partner Hitachi.

That leaves Fujitsu and the Rikagaku Kenkyusho (Riken) research lab in Kobe, Japan saying they will build the fastest scalar computer in Japan, presumably using the Sparc64 Venus chips. It was no accident that Fujitsu was touting the Venus design the day before NEC and Hitachi announced they were ditching the project for financial reasons.

It may look like JAEA is going to follow the lead of the Riken lab, based on the development machine it is installing alongside the new Linux-x64 cluster. But in fact, the JAEA Sparc64 machines will be doing some of the software application development groundwork for the Keisoku system, which is expected to be operational in early 2012.

JAEA stopped short of saying that it would eventually replace the Linux-x64 machine with a giant Sparc64 box. This is the supercomputer business, where technology decisions are based on budgets and politics as much as (or maybe more than) on technology. No matter what JAEA does, it is clear that it has to port its code off Itanium processors onto something, and you can bet that Silicon Graphics wants to peddle its future 'UltraViolet' Xeon-NUMAflex shared memory machines (the kickers to the Altix) to the nuke lab. But it looks like the political tide has shifted, and Japan is looking for homes for indigenous products, and that means SGI is facing a tough sell. Maybe an impossible one. ®
