Nuke-whisperers stuff terabytes of flash into heretical 'Catalyst' super

Intel, Cray, and Lawrence Livermore rethink supercomputer design


Three technological heavyweights have come together to spin up a radically different supercomputer cluster designed to crunch "big data" workloads rather than the simulation and modeling jobs of typical HPC rigs.

The collaboration between Lawrence Livermore National Laboratory, Intel, and Cray was announced on Monday and sees the companies fire up a high-performance computing system named 'Catalyst' that has an order of magnitude more memory than any system that has gone before it.

Catalyst has 304 dual-socket compute nodes equipped with 2.4GHz 12-core Xeon E5-2695v2 processors backed by 128GB of DRAM, along with the Intel TrueScale Fabric. So far, so super – what makes this system different is the whopping 800GB of flash memory attached via PCIe per node. Boffins want to convert slabs of solid-state storage into a secondary tier of memory.

Intel, Cray, and LLNL are going to use the system to crack "big data" problems, and in doing so investigate how new systems can be designed to take advantage of much faster memory media – a crucial investigation, given the likely arrival of some form of next-generation non-volatile RAM (such as HP's Memristor) in the next few years.

Initially, LLNL will use the system to test out a new "data intensive" technique of mapping the solid-state drives into application memory, "making the flash stores look like standard DRAM" to software, Matt Leininger, a deputy for Advanced Technology Projects within LLNL, told us. Though he stressed that apps "need some smarts about what it caches in DRAM versus the flash. This machine is a way to scale that [approach] out from two to three to five nodes to several hundred."
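The underlying trick is the classic memory-mapping approach: a file living on the flash device is mapped into the application's address space, after which loads and stores look no different from DRAM accesses (bar the latency on a page miss). A minimal sketch of the idea in Python – the file path and size here are invented for illustration, and any SSD-backed filesystem stands in for the 800GB PCIe flash store:

```python
import mmap
import os
import tempfile

SIZE = 1024 * 1024  # 1 MiB stand-in for the per-node flash store

# Create a backing file of the desired size on the "flash" device.
fd, path = tempfile.mkstemp()
os.ftruncate(fd, SIZE)

# Map the file into the process address space. Reads and writes now go
# through the page cache to the flash-backed pages, so to the application
# the region behaves like ordinary DRAM.
mem = mmap.mmap(fd, SIZE)
mem[0:5] = b"hello"         # plain byte-slice assignment, no I/O calls
readback = bytes(mem[0:5])

mem.close()
os.close(fd)
os.remove(path)
```

The appeal is that existing code which expects in-memory data structures needs little or no change; the operating system's paging machinery does the staging between tiers.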

The system works along the lines of hardware from the likes of Fusion-io, which offers disk-like capacities at close-to-RAM speeds for software to shift data around in. One area of concern is the aforementioned difference in access times between DRAM and the attached Intel flash, which will require new ways to juggle memory allocation in big apps, Leininger admitted.
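The "smarts about what it caches in DRAM versus the flash" that Leininger mentions boil down to a tiering policy. A toy sketch – emphatically not LLNL's actual software – using a bounded LRU hot set as the "DRAM" tier and a plain dict standing in for the slower flash tier:

```python
from collections import OrderedDict

class TwoTierStore:
    """Toy two-tier store: a bounded LRU 'DRAM' hot set that demotes
    cold entries to a 'flash' tier, and promotes them back on access."""

    def __init__(self, dram_slots):
        self.dram = OrderedDict()   # hot tier, LRU-ordered
        self.flash = {}             # stand-in for the mmap'd SSD tier
        self.slots = dram_slots

    def put(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        while len(self.dram) > self.slots:
            cold_key, cold_val = self.dram.popitem(last=False)
            self.flash[cold_key] = cold_val   # demote coldest entry

    def get(self, key):
        if key in self.dram:
            self.dram.move_to_end(key)        # refresh recency
            return self.dram[key]
        value = self.flash.pop(key)           # promote on access
        self.put(key, value)
        return value

store = TwoTierStore(dram_slots=2)
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    store.put(k, v)
# "a" was least recently used, so it has been demoted to the flash tier
```

Scaled up, the hard part is exactly what Leininger flags: deciding, per application, which working set earns the fast tier when the access-time gap between DRAM and flash is orders of magnitude.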

So, what does all of this have to do with "big data"?

"In traditional HPC the simulation and modeling techniques are typically based on scientific models that have underlying mathematics or physics partial differential equations," Mark Seager, chief technology officer of Intel's advanced computing group, says. "That starts out with a very small amount of data and evolves over time in a time-stepping manner generating lots and lots of data as it progresses."

"In that environment, for that type of computation, you really want to maximize floating-point operations per second per dollar that you invest in. The second most important investment there is interconnect, then memory and IO."


Small cluster, big memory

But with big-data applications, where the cluster must analyze a ton of data that has been generated elsewhere and streamed in – for instance, telemetry from nationwide utility grids, or from geophysical exploration – the infrastructure demands almost reverse. Fast memory – and lots of it – becomes a priority.

"You start with a big amount of data and typically it's on disk and when you do the computation you have to figure out an efficient way to get it off the disks and into the filesystem," Seager says. "Disk is woefully slow and getting slower... NVRAM is an opportunity to get very fast random access to that data."

This approach represents a "major departure from classic simulation-based computing architectures common at US Department of Energy laboratories and opens new opportunities for exploring the potential of combining floating point focused capability with data analysis in one environment," Intel wrote in a statement announcing the system. "Consequently, the insights provided by Catalyst could become a basis for future commodity technology procurements."

Along with each node getting access to 800GB of NVRAM, the system comes with a dual-rail Quad Data Rate (QDR-80) networking fabric, which gives each CPU its own dedicated I/O service. Previously, one socket would get the direct network link and the secondary one would have to talk across QPI.

"By having the dual rail one per socket tightly coupled we can do [stuff] with those flash devices without having to cross the QPI socket," Seager said. "We can double the effective messaging rate."

The combination of this fabric technology with the NVRAM gives Catalyst a cross-cluster bandwidth of half a terabyte per second, which is equivalent to the original incarnation of LLNL's whopping 16-petaflop "Sequoia" system, the world's fastest HPC rig as of June 2012.

The difference is the bandwidth achieved for Catalyst is "an order of magnitude less expensive because the filesystem for Sequoia is based on rotating disks," Seager said.

The full Cray CS300 cluster is capable of 150 teraflops using 304 compute nodes, 12 Lustre router nodes (128GB RAM and 3,200GB NVRAM), two login nodes (128GB DRAM), and two management nodes. Each compute node gets 800GB of NVRAM. The NVRAM comes from Intel's SSD 910 Series: 800GB, half-height PCIe 2.0, multi-level cell flash.
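For the arithmetic-minded, those published node counts imply roughly 38TB of DRAM and 237.5TB of flash across the compute partition alone. A back-of-the-envelope check (binary terabytes, i.e. dividing gigabytes by 1,024; illustrative only):

```python
# Published Catalyst compute-partition figures
compute_nodes = 304
dram_per_node_gb = 128
flash_per_node_gb = 800

# Aggregate capacities across the compute partition
total_dram_tb = compute_nodes * dram_per_node_gb / 1024    # 38.0 TB DRAM
total_flash_tb = compute_nodes * flash_per_node_gb / 1024  # 237.5 TB flash
```

That roughly six-to-one flash-to-DRAM ratio per node is what lets LLNL treat the SSDs as a bulk second tier of memory rather than as mere scratch storage.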

Catalyst's arrival is sure to delight Jean-Luc Chatelain, an executive vice president at DataDirect Networks, who predicted to El Reg a year ago that 2014 would see the arrival of NVRAM as a major storage tier for HPC data. ®



