Cray notches up two Urika graph analysis appliance sales
Will invest $15m in YarcData 'threadmonsters' in 2013
It has been a busy couple of weeks for supercomputer maker Cray, with the rollout of the 27-petaflops "Titan" supercomputer, the debut of the next-generation XC30 system, and the acquisition of sometime HPC rival Appro International. News that Cray had sold two of its Urika big data analytics appliances got a little lost in the shuffle, but the company is finding buyers for the massively multithreaded systems.
The Urika appliance is not a big-data muncher, like a cluster running Hadoop with scads of disk drives and petabytes of data. Those machines sift through mountains of data looking for particular things – like the kinds of things you respond to in advertising so advertisers know what ads to serve you in the future. The Urika appliance is designed with completely different tech to do a very different job, mainly seeking the relationships between different bits of data.
This is called graph analysis, and Cray has taken the XMT-2 "ThreadStorm" processors that are based on an architecture created in the late 1980s by Tera Computer – which ate Cray in 2000 and took its name – and put them to work on a class of problems that are different from the ones they were designed to solve.
Tera was a pioneer in multithreading technology, and the Urika appliance, as the commercial version of the XMT-2 system is now known, is a thread monster. If you want to walk through a dataset and find relationships between bits of data – such as a web of how people are linked to each other – it is best to do that all from within a single memory space and to have lots of threads to do it.
If you try to do it over a traditional cluster of servers, every time you have to leave a node's main memory to go find a bit of data, you have to go out over the system bus, into network controllers, over the network, and back down into another node to get this data. The network is two orders of magnitude slower than main memory. So if you can keep all of the data on a tightly linked cluster of ThreadStorm processors with one shared memory space, it's much better for graph analysis. (You can find a detailed analysis of the Urika machines here.)
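To make that access pattern concrete, here is a minimal Python sketch of a breadth-first walk over a tiny, made-up "who knows whom" graph. The names, the `LINKS` data, and the `hops_between` function are all hypothetical illustrations and have nothing to do with the actual Urika software stack; the point is only that every step is a lookup at an unpredictable spot in memory.

```python
from collections import deque

# Hypothetical "who knows whom" graph, held entirely in one memory space.
LINKS = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}


def hops_between(graph, start, goal):
    """Return the fewest links from start to goal, or None if unconnected."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        if person == goal:
            return dist
        # Each lookup below lands on an effectively random part of the graph.
        # In shared memory it is a memory read; on a cluster it could be a
        # network round trip to whichever node holds that person's record.
        for friend in graph.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return None


print(hops_between(LINKS, "alice", "erin"))
```

On a real dataset the graph has billions of edges rather than a handful, which is why the cost of each individual lookup matters so much.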
The Urika appliances use the "SeaStar2+" 3D torus interconnect that is now two generations behind the "Gemini" and "Aries" interconnects used in current Cray systems. But that doesn't matter. The XMT-2 processors that were in beta last year and that launched in the spring run at 500MHz, so the SeaStar2+ interconnect is plenty fast enough; each processor has 128 threads and plugs into an Opteron Rev F socket.
Four of these XMT-2 processors slot into a Cray XT5 node, which has external DDR2 memory controllers welded onto it because the ThreadStorm chips don't have their own memory controllers. Each XMT-2 blade maxes out at 256GB of memory (four sockets, each with four slots that max out at 16GB apiece) and 512 threads. Four of these blades go into a rack. The SeaStar2+ interconnect can lash together as many as 8,192 processors into a single address space with 128TB of main memory and 1.05 million threads.
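The thread and blade-memory figures above follow from simple multiplication. A quick Python sketch of that arithmetic, using only the per-processor and per-slot numbers quoted in the article (the constant and function names are made up for illustration):

```python
# Figures quoted in the article; the names are illustrative only.
THREADS_PER_PROCESSOR = 128
PROCESSORS_PER_BLADE = 4
SLOTS_PER_SOCKET = 4
GB_PER_SLOT = 16
MAX_PROCESSORS = 8192


def threads(processors):
    """Total hardware threads for a given number of XMT-2 processors."""
    return processors * THREADS_PER_PROCESSOR


blade_threads = threads(PROCESSORS_PER_BLADE)
blade_memory_gb = PROCESSORS_PER_BLADE * SLOTS_PER_SOCKET * GB_PER_SLOT
max_threads = threads(MAX_PROCESSORS)

print(f"per blade: {blade_threads} threads, {blade_memory_gb}GB")
print(f"full system: {max_threads:,} threads")
```

The same `threads()` helper gives 8,192 threads for a 64-processor machine and 4,096 for a 32-processor one, which lines up with the Oak Ridge and PSC configurations described further down.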
This is the kind of box that Facebook and LinkedIn probably want, and that the CIA and NSA probably already have.
The XMT-2 machines and their Urika appliance version are not just used for graph analysis, but are also suited to data mining, pattern matching, power grid analysis, and other kinds of crunching where many bits of data have to be compared quickly – and they do particularly well if that data needs to be accessed randomly and is in an unstructured format.
The first XMT-2 box was sold to the Swiss National Supercomputing Centre (CSCS) back in February 2011; its precise configuration was unknown at the time. But as it turns out, the "Matterhorn" machine at CSCS has 64 processors and 2TB of memory (PDF), which makes it a four-rack system.
Noblis, a non-profit technical consulting company based outside of Washington, DC, has a 128-processor XMT-2 machine that spans two racks.
Back in January 2011, Cray said that IBM was looking at using the MTA-2 chips inside of its Netezza data warehousing machines, and that news and legal database provider LexisNexis was also looking at the XMT-2 machines. No word on what happened here, but Cray hasn't announced any deals yet, so presumably they did not close.
Cray's Urika graph appliance
But last week, Cray inked two more deals for the Urika appliances based on the XMT-2 hardware, and that is good news for its YarcData (yes, that's Cray spelled backwards and doesn't mean anything) division, which was established back in February. Arvind Parthasarathi, who was previously senior vice president and general manager of Informatica's Master Data Management (MDM) business unit, was put in charge of this division and given the task of commercializing the XMT-2 line.
The first new sale for Urika was at Oak Ridge National Laboratory, and it has nothing whatsoever to do with the Titan supercomputer that is now the most powerful machine in the world. Oak Ridge will be using its threadmonster to conduct research on fraud detection in the healthcare industry.
"Identifying healthcare fraud and abuse is challenging due to the volume of data, the various types of data, as well as the velocity at which new data is created," explained Jeff Nichols, associate laboratory director for computing and computational sciences at Oak Ridge, in a statement. "YarcData's Urika appliance is uniquely suited to take on these challenges, and we are excited to see the results that will come from the strategic analysis of some very large and complex data sets."
The Oak Ridge Urika machine will have 64 processors with a total of 8,192 threads all accessing the same 2TB of shared memory. The machine will have its own dedicated 116TB file system that also has nothing to do with the Titan system.
The Pittsburgh Supercomputing Center has also just deployed its own Urika appliance nicknamed "Sherlock," which is being funded by the National Science Foundation's Strategic Technologies for Cyberinfrastructure (STCI) program. This machine has a custom configuration of the XMT-2 hardware that puts 32 processors with a total of 4,096 threads against 1TB of shared memory. PSC will carve Sherlock into sixteen separate partitions for application development – that's the customization part – and link it to its own file systems.
Cray has not said what the Urika appliances cost or what its revenue expectations are for the line, but after many years of development and presumably funding from the US government to cover the costs of that development, Cray is clearly ready to cash in on that investment. (It's a good business: get the government to pay for development, and then charge the government again to build and support systems while being allowed to peddle machines using government-sponsored technology to other academic and commercial institutions.)
For the moment, Cray seems to be patient about the YarcData division. But it doesn't take a supercomputer to figure out that Cray set up a separate division earlier this year not just to have a locus of development in Silicon Valley, but to keep the books separately. That could mean any number of things, such as Cray preparing YarcData for phenomenal growth or for a potential spinoff, or just giving YarcData enough separation so it can build a separate team tackling very different kinds of problems.
In a conference call last week going over Cray's acquisition of Appro and its financial results for the third quarter, CEO Peter Ungaro said that the software stack on Urika had just been tweaked to deliver better performance and that the product was transitioning from early customer adoption to "production-ready systems."
Ungaro said that Cray would invest another $15m in YarcData in 2013, and said further that "with the growing pipeline, we expect to continue to steadily ramp this business going forward."
That sure doesn't sound like a spinoff. But if Intel or someone else comes a-knockin' for a massively multithreaded processor, Cray would probably be more than happy to sell it. Just like it was happy to sell its Gemini and Aries interconnect to Intel back in April for $140m. ®