'Urika': Cray unveils new 1,500-core big data crunching monster

6TB of DRAM, 38TB of SSD flash and 120TB of disk storage

Big data analytics people are constantly panning for nuggets of gold, and Cray has just the machine for them — its Urika-XA.

Cray pitches it as a single platform, consolidating a wide range of analytics workloads that previously needed separate systems, with a design optimised for compute-intensive, memory-intensive and latency-sensitive work.

Cray’s Urika line has been around for four years and is an appliance-like product for analysing big data, essentially chewing through numbers looking for relationships between things. It was introduced by Cray’s YarcData division and called a graph analytics appliance.

Now we have Urika-XA as a turnkey, scale-out, analytics appliance. Cray says it has hot hardware: over 1,500 cores, 6TB of DRAM, 38TB of SSD flash and 120TB of disk storage. Stick the working set data in and munch through it with sharp processing teeth.

The original Urika system is now called the Urika-GD and positioned for graph-based analytics. The XA product is for “extreme analytics” (hence XA) and is described as a “pre-integrated, open platform for high-performance big data analytics”.

Think of it crudely as a go-faster Urika-GD without the graph stuff.

Archimedes

A single Urika-XA rack features:

  • 48 Intel Xeon compute nodes with an 800GB SSD per node
  • 200TB of SSD and disk storage using a Sonexion 900 array
  • InfiniBand interconnect
  • Lustre parallel file system with HDFS compatibility and POSIX compliance
  • High availability
  • Software stack with Cloudera Enterprise, Apache Spark, the Cray Adaptive Runtime for Hadoop and the Urika-XA management system

Sonexion storage is OEM’d from the Seagate-acquired Xyratex, and based on its ClusterStor technology.

Cray Urika-XA

Urika-XA has a flagship first customer in the US Department of Energy’s Oak Ridge National Laboratory. The analysis people there will use it in climate science, materials science and healthcare areas.

An obvious comparison is with DDN’s GS7K appliance, which uses GPFS instead of Lustre for its parallel file system work and can link to an object storage backend.

Red Hat has a scale-out, Gluster-based Open Storage Server if you’re enamoured with the Gluster file system and want to build your own kit. It has a Hadoop file system plug-in.

IBM’s GPFS-based Elastic Storage Server may also be on the competitive comparison list for enterprise fast-access big data analytics work.

The Cray marketing pitch says it’s coming from supercomputing land with “battle-hardened” technology. The use-case areas (set your bullshit detector to alert) include:

  • Financial services for risk management
  • Life sciences — next-gen sequencing for example
  • Governments looking at “the pattern of life” whatever the heck that means
  • Sports with matchup optimisation
  • Telecom with churn analysis
  • Media with data-driven journalism

Data-driven journalism means analysing large data sets and building a news story on the results. Any media organisation that would buy a packaged supercomputer appliance for this has got very deep pockets and a degree of faith that would impress a zealot.
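To make one of those use cases concrete, here’s a toy sketch of the telecoms item — churn analysis — in plain Python. Every record, field name and figure below is invented for illustration; the real thing would chew through millions of call-detail records on the Spark/Hadoop stack, but the metric itself is just this simple:

```python
from datetime import date

# Toy subscriber records: (customer_id, last_activity_date, cancelled).
# All names and dates here are made up for illustration.
subscribers = [
    ("c1", date(2014, 9, 1), False),
    ("c2", date(2014, 3, 15), True),
    ("c3", date(2014, 10, 20), False),
    ("c4", date(2014, 2, 2), True),
]

def churn_rate(records):
    """Fraction of subscribers who cancelled in the period."""
    cancelled = sum(1 for _, _, gone in records if gone)
    return cancelled / len(records)

print(churn_rate(subscribers))  # 2 of 4 cancelled -> 0.5
```

The analytics value, of course, is not the ratio but slicing it by plan, region and usage pattern to predict who leaves next — which is where the 1,500 cores come in.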

Cray’s basically saying use its supercomputer-derived big data tech because you’ll get the Eureka! moments faster and more often. Whether you will depends on your data boffins asking the right questions. Whatever they ask though, they’ll probably get their answers quicker with this Urika-XA beast.

Cray will happily sell you a multi-rack config and Urika-XA will be available in December. ®
