Cleversafe: Our mad rig can gobble A TERABYTE in A SECOND
Big Data? This is Giant Fat Chair-smashing Bastard Data
Cleversafe claims it has the biggest mouth for objects on the planet, gulping them in at a terabyte a second.
The company is an object storage startup, and its 3000 series of appliances can cram in a terabyte's worth of objects every second, but you need a thousand, yes indeed, 1,000, ingest boxes, or Accesser 3100 storage router appliances, in front of a 10EB storage cloud to do it.
The Accessers read and write objects into the object storage namespace independently and each one sucks in data at a rate of 1GB/sec.
A 10EB storage cloud is not just a Big Data installation; it is an effing gigantic data installation comprising:-
- 16 sites
- 24 portable datacenters per site (384 total)
- 21 racks per portable datacenter (8,064 total)
- 147 storage nodes (Slicestors) per portable datacenter (56,448 total)
- 84 3TB drives per Slicestor (4.7M drives total)
- 1000 Accesser 3100 appliances (1GB/s per appliance, 1TB/s total ingest)
- 96 dsNet Manager 3100 appliances (each manages 100PB of infrastructure)
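The totals in the configuration listed above follow from simple multiplication, and a short Python sketch can check them. The constant and function names here are illustrative only, and the raw-capacity figure is derived rather than quoted: it assumes decimal units (1EB = 1,000,000TB). The gap between that raw figure and the 10EB headline is presumably redundancy overhead, though Cleversafe's numbers as reported here do not say so.

```python
# Figures quoted in Cleversafe's 10EB configuration.
SITES = 16
PDCS_PER_SITE = 24           # portable datacenters per site
RACKS_PER_PDC = 21
SLICESTORS_PER_PDC = 147
DRIVES_PER_SLICESTOR = 84
DRIVE_TB = 3
ACCESSERS = 1000
ACCESSER_GB_PER_SEC = 1
MANAGERS = 96
PB_PER_MANAGER = 100


def totals():
    """Recompute the headline totals from the per-unit figures."""
    pdcs = SITES * PDCS_PER_SITE
    racks = pdcs * RACKS_PER_PDC
    slicestors = pdcs * SLICESTORS_PER_PDC
    drives = slicestors * DRIVES_PER_SLICESTOR
    return {
        "portable_datacenters": pdcs,
        "racks": racks,
        "slicestors": slicestors,
        "drives": drives,
        # Assumption: decimal units, 1EB = 1,000,000TB.
        "raw_capacity_eb": drives * DRIVE_TB / 1_000_000,
        "ingest_tb_per_sec": ACCESSERS * ACCESSER_GB_PER_SEC / 1000,
        "managed_eb": MANAGERS * PB_PER_MANAGER / 1000,
    }


if __name__ == "__main__":
    for name, value in totals().items():
        print(f"{name}: {value:,}")
```

On these inputs the sketch gives 384 portable datacenters, 8,064 racks, 56,448 Slicestors and 4,741,632 drives, matching the list. It also shows about 14.2EB of raw disk and 9.6EB under management by the 96 dsNet Managers.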
This is steroidal, pump-up-the-numbers marketing. As if anybody would ever put 4.7 million disk drives in 56,448 Slicestors in 8,064 racks across 16 sites. It's hugely impressive - but like a theoretical orgasm for object storage buffs.
A 1TB/sec ingest rate is similar to high-performance computing data ingest rates - DataDirect Networks' SFA12ke claims a 20GB/sec file ingest rate per rack - but this isn't an HPC situation, not when spread across 16 sites. Ingesting and storing unstructured information in the Cleversafe object store also seems to involve a fair amount of processing, as each object needs to have a hash calculated.
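To put the two ingest figures side by side, here is a back-of-the-envelope sketch in Python. It uses only the numbers quoted above (1TB/sec aggregate, 20GB/sec per DDN rack, a 10EB cloud) and assumes decimal units and a sustained peak rate, which no real system would hold. The function names are illustrative.

```python
TARGET_INGEST_GB_PER_SEC = 1000   # Cleversafe's claimed 1TB/sec aggregate
DDN_GB_PER_SEC_PER_RACK = 20      # DDN's claimed per-rack file ingest rate
CLOUD_CAPACITY_EB = 10
SECONDS_PER_DAY = 86_400


def ddn_racks_for_target():
    """Racks needed to match 1TB/sec at the quoted per-rack rate."""
    return TARGET_INGEST_GB_PER_SEC // DDN_GB_PER_SEC_PER_RACK


def days_to_fill_cloud():
    """Days to write 10EB at a constant 1TB/sec (1EB = 1,000,000TB assumed)."""
    capacity_tb = CLOUD_CAPACITY_EB * 1_000_000
    seconds = capacity_tb / (TARGET_INGEST_GB_PER_SEC / 1000)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(ddn_racks_for_target())
    print(round(days_to_fill_cloud(), 1))
```

By this arithmetic, 50 racks at the quoted DDN rate would match the aggregate figure, and filling 10EB at a flat 1TB/sec would take roughly 116 days.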
The object storage vendors say their technology is better at storing billions of files across dispersed sites than a traditional file system. When we see one of them doing just that we will all be impressed. Until then it's just modelling and the real thing is always better than a model. ®
COMMENTS
Tens
Tens of petabytes in a single production system: definitely real today with several Cleversafe customers (first-hand knowledge). Tens of exabytes in production: not that far off, given the growth of data and the need to keep more available and online for analysis. The challenge is to make sure you have an architecture and a platform that can get there when your organization needs you to be there.
DataDomain?
I work for EMC, trust me, the sort of workload this science project fantasy is talking about would hardly be put onto a (100's of) DataDomain(s). Atmos could scale to these numbers and beyond though.
Speculating is fun though, 60EB is about 1/30 of the total amount of digital content generated in 2011 from what IDC estimates, so go ahead and triple those numbers and there is some room for growth :)
