Seagate serves up three-layer ClusterStor sandwich
Online disk archive for HPC and Big Data
Seagate has three new ClusterStor HPC arrays: the A200, G200 and L300. The A200 is a cold data archive array. The G200 is loaded with IBM’s Spectrum Scale parallel file system software, GPFS that was, while the L300 is an equivalent array running Lustre parallel file system software.
The existing ClusterStor range includes 1500, 6000 and 9000 products. The 1500 product centres on 4U scale-out storage units (SSUs) or building blocks, containing object storage servers, with integrated Lustre parallel file system and 2U ClusterStor management unit. Each can have three 4U expansion storage units. Performance starts at 1GB/sec and scales up past 100GB/sec.
The 6000 is for Big Data and HPC. It is a rack-scale system with larger, 5U, SSUs, each with up to 6GB/sec of read/write filesystem performance (36GB/sec/rack with 6 x SSUs). Seagate’s 9000 is also rack-scale, designed for productivity-critical HPC and Big Data applications, and offers up to 63GB/sec/rack.
Seagate now says the ClusterStor family includes L300, A200, G200, 9000, 1500, and the ClusterStor Secure Data Appliance, implying the 6000 has gone away.
How do the A200, G200 and L300 stack up against these existing systems?
ClusterStor L300 rack
The A200 is a single tier, active archive, object store for the ClusterStor product line. Customers can migrate static data from Lustre ClusterStor arrays and, in a future release, ClusterStor Spectrum Scale arrays.
It consists of five to seven scalable storage units per rack with additional storage units and racks added for expansion. The SSUs have SATA shingled magnetic recording (SMR) disk drives and can scale to a phenomenal 340 undecillion (2^128, or about 3.4 x 10^38) objects, with effectively unlimited object size and storage capacity.
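For a sense of scale, the object limit can be checked in a few lines of Python. This is only a sketch: it assumes the limit is 2 to the power of 128 and uses short-scale naming, in which one undecillion is 10^36.

```python
# Assumption: the object limit is 2**128 (a 128-bit identifier space).
OBJECT_LIMIT = 2 ** 128

# Short-scale naming: one undecillion is 10**36.
UNDECILLION = 10 ** 36

print(OBJECT_LIMIT)                           # exact integer value
print(f"{OBJECT_LIMIT:.1e}")                  # scientific notation
print(round(OBJECT_LIMIT / UNDECILLION, 1))   # count in undecillions
```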
Seagate says that, with SMR drives, readable tracks are narrower than originally written tracks, which results in 30-40 per cent more tracks being stored on the platter than standard Perpendicular Magnetic Recording drives using SAS technology.
There are two 6TB SAS metadata disk drives per SSU and the A200 provides 524 usable TB per tray, over 3.6 usable PB per rack (4.59PB raw). There is up to 10GB/sec throughput per rack, and dual 10GbitE connectivity. Add racks to scale performance.
The pre-configured ClusterStor A200 includes an automatic policy-driven hierarchical storage management (HSM) gateway system with data movers acting as Lustre clients and working against pre-defined policies. Migrated files are stubbed in their primary storage so they are still online.
According to Seagate, when a Lustre client/user attempts to use this file, it is automatically and transparently retrieved from the secondary storage tier before proceeding. Once a file is re-hydrated, it can again be stubbed per policy once it meets policy criteria.
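The stub-and-rehydrate cycle described above can be illustrated with a toy model. This is a minimal sketch, not Seagate's or Lustre's HSM interface: the class names, the `idle_days` field and the single idle-time policy are all hypothetical, chosen only to show the idea of a file that stays visible while its data lives on a second tier.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FileEntry:
    """A file on primary storage; data is None while the file is a stub."""
    name: str
    data: Optional[bytes]
    idle_days: int = 0


@dataclass
class ToyHsm:
    """Hypothetical policy engine: archive files idle for max_idle_days."""
    max_idle_days: int
    archive: Dict[str, bytes] = field(default_factory=dict)

    def apply_policy(self, f: FileEntry) -> None:
        # Migrate the data to the archive tier and leave a stub behind.
        if f.data is not None and f.idle_days >= self.max_idle_days:
            self.archive[f.name] = f.data
            f.data = None

    def read(self, f: FileEntry) -> bytes:
        # Transparent recall: a stub is rehydrated before the read proceeds.
        if f.data is None:
            f.data = self.archive[f.name]
        f.idle_days = 0
        return f.data


hsm = ToyHsm(max_idle_days=30)
results = FileEntry("results.dat", b"simulation output", idle_days=45)
hsm.apply_policy(results)      # file is now a stub, still listed on primary
payload = hsm.read(results)    # recalled from the archive tier on access
```

Once the file goes idle again, another `apply_policy` pass would stub it a second time, which is the per-policy behaviour Seagate describes.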
There is a highly available ClusterStor management unit. The array implements Seagate’s 3rd generation, network-based erasure coding with up to 11 nines of durability. Failed disk rebuilds are carried out by the entire system, shortening rebuild time, and a failed 8TB drive can be reconstructed in less than 60 minutes. It has four nines of availability, or 99.99 per cent uptime.
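Some back-of-the-envelope arithmetic puts those figures in context. The sketch below assumes decimal units (1TB = 10^12 bytes), a full 8TB of data to reconstruct, and a 365-day year; none of these assumptions comes from Seagate.

```python
# Assumptions: decimal units, a completely full drive, a 365-day year.
DRIVE_BYTES = 8e12             # 8TB drive
REBUILD_SECONDS = 60 * 60      # the quoted 60-minute upper bound
AVAILABILITY = 0.9999          # "four nines"

# Minimum aggregate rebuild rate needed to finish within the hour.
rebuild_gb_per_sec = DRIVE_BYTES / REBUILD_SECONDS / 1e9

# Downtime per year permitted by 99.99 per cent availability.
downtime_minutes_per_year = (1 - AVAILABILITY) * 365 * 24 * 60

print(f"rebuild rate: at least {rebuild_gb_per_sec:.2f}GB/sec")
print(f"allowed downtime: about {downtime_minutes_per_year:.1f} minutes a year")
```

A rate of more than 2GB/sec is well beyond what a single disk can sustain, which is consistent with the rebuild being spread across the whole system.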
Seagate says “the default supported CS A200 erasure coding schema is 8+2, meaning that objects are sharded into eight data shards and 2 parity chunks and written across 10 networked SSU’s. An A200 configuration containing ten or more SSUs maintains full data read availability even in the event of two concurrent SSU failures or two concurrent drive failures.”
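The 8+2 scheme also squares with the capacity numbers quoted earlier. A minimal sketch, assuming usable capacity is simply raw capacity times the data-shard fraction (ignoring metadata drives and any file system overhead, which is our simplification, not Seagate's):

```python
def usable_fraction(data_shards: int, parity_shards: int) -> float:
    """Fraction of raw capacity left for data under k+m erasure coding."""
    return data_shards / (data_shards + parity_shards)


DATA_SHARDS, PARITY_SHARDS = 8, 2    # the default 8+2 schema
RAW_PB_PER_RACK = 4.59               # raw figure quoted for the A200

efficiency = usable_fraction(DATA_SHARDS, PARITY_SHARDS)
usable_pb = RAW_PB_PER_RACK * efficiency

# Any m shards can be lost without losing the ability to read an object.
tolerated_losses = PARITY_SHARDS

print(f"efficiency: {efficiency:.0%}")
print(f"usable per rack: {usable_pb:.2f}PB")
print(f"shard losses tolerated: {tolerated_losses}")
```

That works out to roughly 3.67PB, in line with the "over 3.6 usable PB per rack" claim and with seven trays at 524TB each.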
The L300 is the ClusterStor Lustre product for productivity-critical HPC data storage requirements. Seagate says it is an “engineered solution”, shades of Oracle, which features “Cluster in a Box” integration of dual high availability Object Storage, Metadata and Management Servers. It is a pre-configured rack-level system with up to 112GB/sec bandwidth per expansion rack and 96GB/sec per base rack.
There is no single point of failure and the array has Intel Omni-Path, Mellanox EDR and 100GbitE network connectivity support. The array has Lustre Distributed Namespace Environment (DNE) support of up to 18 metadata servers per file system.
The L300, and the G200, are the first arrays to deploy a ClusterStor HPC drive. This 3.5-inch drive stores up to 4TB, and features a surprisingly high sequential data rate of 300MB/sec. Seagate says it has 35 per cent faster random performance than any other 3.5-inch drive. No more details are available.
To us here at the El Reg storage desk it looks as if the L300 replaces the existing 6000 Lustre array.
The G200 is, roughly speaking, a Spectrum Scale version of the L300 Lustre array, albeit much slower at up to 63GB/sec per rack. It holds up to 4.48PB per rack, with 4, 6 or 8TB SAS drives available, including, we understand, ClusterStor HPC disks.
There can be up to seven 5U, 84-slot storage chassis per rack, each with dual Network Shared Disk servers configured as a high availability pair.
Dual-ported SAS SSDs are used for metadata management. Connectivity options are InfiniBand QDR (40Gbit/s) or FDR (56Gbit/s), or Ethernet 10GE or 40GE. The L300 has EDR (100Gbit/s) InfiniBand available.
Seagate says it has a "unified modular architecture that integrates all of the IBM Spectrum Scale Network Shared Disk (NSD) user data and meta data functions into a single Seagate high-density disk enclosure with high-availability server pairs."
Systems are delivered assembled, configured and ready to deploy.
The G200 storage system units are delivered in two models; the first has 84 slots for 3.5” HDD and SSD in a 5U enclosure with embedded storage controllers.
The ClusterStor G200 Spectrum Scale will be available in Q1 2016. We understand the 8TB ClusterStor HPC Drive and Ethernet Client Access support will also be available in the first quarter of 2016. ClusterStor Manager (GUI) support will be available in the first half of 2016. ®