I'm the world's fastest! No, I am! And I'm staggering, too!
HPC works of heartbreaking genius argue as HP goes SMB
ISC 2012 Xyratex has formally launched its ClusterStor high-performance computing drive arrays, claiming they are the fastest data storage arrays for high-performance computing in the industry. At the other end of the scale, HP has revved its X5000 NAS filer, upping capacity and adding iSCSI SAN access.
Xyratex' ClusterStor 6000 is a pre-configured, rack-level storage cluster that offers from 6GB/sec to 1TB/sec of Lustre file system processing capability. Its speed claim conflicts with that of Panasas, which produces, it says, "the world's fastest parallel storage system."
Panasas more explicitly claims it can scale "performance to a staggering 150GB/s, the industry's highest single file system throughput per terabyte of enterprise SATA storage."
The ActiveStor 12 product, using the PanFS filesystem, scales to 1.6GB/sec for writes and 1.5GB/sec for reads, roughly half as fast again as the ClusterStor 6000.
Ken Claffey, Xyratex' business line manager for ClusterStor products, said:
"We claim 2x the performance based on the fastest competitor we could find which was the DDN SFA12ke which claims 20GB/sec file system performance per rack."
A DDN SFA12K-20e rack consists of ten 4U base enclosures and does 20GB/sec in a single system. The SFA12K-40 does 40GB/sec in a single system.
Xyratex' ClusterStor 6000 achieves 6GB/sec from one 5U SSU (Single Storage Unit, or base enclosure), and there can be seven in a rack with promised linear performance scalability, meaning 42GB/sec per rack. This is indeed more than 2x the DDN SFA12K-20e's performance, and a smidgin faster than the SFA12K-40's 40GB/sec.
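The per-rack arithmetic behind these claims is simple enough to sketch. This is a hedged illustration using only the figures quoted above; the linear scaling is a vendor promise, not an independently measured result.

```python
# Sketch of the per-rack throughput comparison, using the article's figures.
# Linear scaling across SSUs is Xyratex's promise, not a measured result.

def rack_throughput(gb_per_unit, units_per_rack):
    """Projected GB/sec per rack, assuming linear scaling across units."""
    return gb_per_unit * units_per_rack

clusterstor_per_rack = rack_throughput(6, 7)   # 6GB/sec per 5U SSU, seven per rack
ddn_sfa12k_20e = 20                            # 20GB/sec per rack, single system

print(clusterstor_per_rack)                    # 42
print(clusterstor_per_rack / ddn_sfa12k_20e)   # 2.1, i.e. better than 2x
```

On those numbers, the 42GB/sec projection is a touch above even the SFA12K-40's 40GB/sec single-system figure.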
The Panasas PAS 12 does up to 15GB/sec per rack according to an ESG validation paper and so Xyratex' claim is confirmed.
We now have a 3-way race to produce cost-effective, space-efficient and high-performance HPC storage arrays with open source Lustre-based systems from DDN and Xyratex positioned against the management ease of the integrated software/hardware products from Panasas.
HP X5000 unified storage
HP has boosted its X5000 G2 filer appliance, the one using Windows Storage Server 2008 introduced in November last year, by adding:
- A Small Form Factor (SFF) drive chassis with 36 SFF disk drive slots
- Three orderable product configs based on that
- Support for 3TB 7,200rpm 3.5-inch disk drives in the Large Form Factor (LFF) X5000 G2 chassis, meaning a 48TB to 192TB capacity range
- An orderable X5000 G2 LFF config with eight 3TB drives
The X5000 has a file deduplication feature that HP says can recover up to 40 per cent of the system's capacity. It makes the point that, as the X5000 is Windows-based, it's fully compatible with Windows Active Directory, Distributed File System (namespace and replication), Microsoft System Center and more.
The system can also provide iSCSI block access, as it has a Microsoft iSCSI software target inside.
The 48TB capacity model uses four 300GB 2.5-inch 10K SAS drives and 16 2TB 7.2K 3.5-inch drives. Each X5460sb controller blade has two 2.5-inch drives pre-loaded with the Windows OS; these are not used for data. That means 32TB is available for user data.
HP's X5000 FAQ says:
"Using four D2600 Disk Enclosures and 3TB drives, up to 192TB of raw user capacity is available for a single X5000 G2 Network Storage System."
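The 48TB-to-192TB range quoted in the FAQ is consistent with a simple bay count. This is a hedged reconstruction: the 16 LFF data bays in the X5000 G2 head are an assumption on my part, while the 12-bay D2600 shelf and the 3TB drive size come from HP's published figures.

```python
# Sketch of the raw-capacity arithmetic implied by HP's FAQ figures.
# ASSUMPTION: 16 LFF data bays in the X5000 G2 head unit; the D2600 is a
# 12-bay LFF enclosure and the drives are 3TB, per the article.

DRIVE_TB = 3
head_bays = 16       # assumed LFF data slots in the X5000 G2 chassis
d2600_bays = 12      # bays per D2600 disk enclosure
shelves = 4          # "four D2600 Disk Enclosures" from the FAQ

base_tb = head_bays * DRIVE_TB                          # entry configuration
max_tb = base_tb + shelves * d2600_bays * DRIVE_TB      # fully expanded

print(base_tb, max_tb)  # 48 192
```

Both ends of the quoted 48TB-192TB range fall out of that count exactly, which suggests the FAQ's "raw user capacity" is counting every data bay at 3TB.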
That's a useful jump in capacity up from the previous 100TB maximum. ®
In my experience...
In the GPFS world, the most relevant thing is not what each individual storage subsystem can achieve - knowing what each individual LUN (often a raid-5 set in a subsystem along with several others) can do in a certain configuration helps with the design, though. It helps drive how many subsystems to use to get to a certain performance target. It helps to understand the costs, the heat output, things like that. It's not the primary driving factor in how you get high performance out of your storage system.
Overall system performance to a single file is a better measure of an HPC storage system. Other things like how many metadata operations a system can handle (that is, how quickly you can create or delete thousands or millions of files) is also important.
Back to the bandwidth "record" claims - for context, there are several production installations running over 100GB/s to a single file, and there have in fact been several for quite some time.
Here's a paper from Livermore from their 2006 exploits with GPFS showing 129GB/s (write) and 153GB/s (read) against a 122GB/s bandwidth requirement:
I imagine there are similar publications about Lustre, although I've not been involved in those personally.
Getting excited about how many MB/s a single storage subsystem can handle is a bit like getting excited about how much coal you can shove into a single carriage of a train. It's technically interesting, but most people with a big requirement would simply put another carriage on the train.
It's all about picking the most appropriate criteria
Of course the fun thing about performance claims of any type is that it all depends on what your criteria are. The comparisons in this article are primarily focused on performance per rack which actually favors the highest density solutions more than it does the highest performance solutions. I have yet to hear a prospect say “what I really need is xx GB/s per rack” – instead they say “I need a specific amount of usable storage capacity that does at least xx GB/s” and then there’s a multi-vendor race to compete for the business.
At Panasas, for our "world's fastest" claim we have used the metric that we believe has the most relevance to customers: file system performance per disk – how much measurable delivered parallel file system performance across the network to the client is possible per 7.2K SATA drive in the system. For large file throughput workloads accessed by a typical HPC Linux cluster, our customers can reproducibly measure 1600MB/s write throughput from a single 20-drive ActiveStor 12 shelf using the open source IOR benchmark across an optimal network setup, with this result scaling near-linearly with additional shelves. That’s a full 80MB/s per SATA drive in delivered file system performance. To the best of our knowledge, that is still the world’s fastest and allows Panasas to offer more performance for any specific amount of capacity.
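The per-drive metric described above reduces to one division, and the near-linear scaling claim is the statement that the ratio holds as shelves are added. A quick sketch using only the figures quoted in the comment:

```python
# Sketch of the Panasas per-drive metric, from the comment's own figures:
# 1600MB/s of delivered write throughput per 20-drive ActiveStor 12 shelf.

def per_drive_mb_s(shelf_mb_s, drives_per_shelf):
    """Delivered file system throughput per SATA drive, in MB/s."""
    return shelf_mb_s / drives_per_shelf

print(per_drive_mb_s(1600, 20))          # 80.0 MB/s per drive, one shelf

# The near-linear scaling claim: n shelves deliver roughly n * 1600MB/s,
# so the per-drive figure stays flat as the system grows.
print(per_drive_mb_s(5 * 1600, 5 * 20))  # still 80.0
```

The point of the metric is that it normalises away rack density, which is exactly the axis the per-rack comparisons in the article reward.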
I think it's also worth noting that, to the best of our knowledge, neither Xyratex nor DDN has ever published independent benchmarks proving their file system throughput numbers with the Lustre or GPFS file systems on top of their hardware. Panasas has provided independent third-party validation to substantiate its claims (the ESG report that Chris' article referenced).
Ultimately though, what is much more important than performance for most real world deployments is ease of use/manageability along with superior reliability, availability and serviceability – these are all major strengths of Panasas ActiveStor.
Sr. Director of Product Marketing
There I was getting worried about storage bottlenecks in terapixel image processing applications.
What wonderful times we live in.