NetApp StorageGRID now on cloud nine: Would you put 100PBs in it?
Show us an actual installation
Analysis NetApp continues the slow development of its object storage capability by adding a cloud interface.
Version 9.0 of StorageGRID, the storage software gained by NetApp when it bought Bycast in mid-2010, sounds like a major release but the only feature NetApp emphasises is the new Cloud Data Management Interface (CDMI).
CDMI is an SNIA standard for self-provisioning, administering and accessing cloud storage. Other StorageGRID access methods include NFS, CIFS, and an API using RESTful HTTP. Object storage vendors often provide proprietary APIs, and CDMI is a more open approach.
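For readers who haven't met CDMI, here is a minimal sketch of what a request to create a data object looks like on the wire. It only builds the request and does not send it. The host `grid.example.com`, the `/cdmi` root, the container name, and the version string `1.0.2` are assumptions for illustration, not StorageGRID's documented paths or supported version. The header name and the `application/cdmi-object` media type come from the SNIA specification.

```python
import json

# Assumed spec version; a real client should use whatever the server advertises.
CDMI_VERSION = "1.0.2"


def build_cdmi_put(base_url, container, name, text, metadata=None):
    """Return (method, url, headers, body) for a CDMI data-object create.

    base_url, container and name are hypothetical; nothing is sent.
    """
    url = f"{base_url.rstrip('/')}/{container.strip('/')}/{name}"
    headers = {
        # Every CDMI request declares the spec version it speaks.
        "X-CDMI-Specification-Version": CDMI_VERSION,
        # CDMI data objects travel as JSON under this media type.
        "Content-Type": "application/cdmi-object",
        "Accept": "application/cdmi-object",
    }
    body = json.dumps({
        "mimetype": "text/plain",
        "metadata": metadata or {},
        "value": text,
    })
    return "PUT", url, headers, body


method, url, headers, body = build_cdmi_put(
    "https://grid.example.com/cdmi", "archive", "report.txt",
    "hello", {"owner": "reg"},
)
print(method, url)
print(json.dumps(headers, indent=2))
print(body)
```

The point of the standard is that the same request shape should work against any CDMI-compliant store, whereas a proprietary REST API needs its own client code for each vendor.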
NetApp is the only public supporter of it so far. Competing object storage suppliers Amplidata, EMC Atmos/Centera, Caringo, Dell, HDS and Scality haven't signed up yet, but momentum for more companies to support CDMI is thought to be building.
StorageGRID is twinned with NetApp's E-Series arrays to form a distributed content repository that comes with the usual impressive-sounding big data object storage claims: NetApp says it scales to hundreds of petabytes in a single namespace covering billions of files across hundreds of sites.
The El Reg storage desk is cynical because no company is actually storing hundreds of petabytes of data in billions of files inside a single namespace across hundreds of data centres. Until somebody actually does this, the promise of large-scale object storage is just that: a theoretical promise. Yes, it may well scale, and it probably is real, but the technology appears to be available well in advance of the market need.
A NetApp Data Bingo blog article states: "Traditional file systems and access methods were not designed to store hundreds of millions or billions of files in a single namespace. This leads to admins storing data in multiple file systems, multiple shares, complex directory structures – not because the data should be logically organized in that way, but simply because of limitations in file system architectures. This issue becomes even more pressing when data sits in multiple locations, maybe even across on-premise and off-premise, cloud-based storage."
That's all very well but the problems aren't big enough - yet - to prompt a general transition to object storage technologies.
NetApp mentions customer Iron Mountain in the release of StorageGRID version 9. Iron Mountain was already a Bycast customer, and NetApp either hasn't been able to, or has chosen not to, focus on any other customer or make claims about new customers for StorageGRID.
What object storage needs is a thumping great demo: a real implementation of the billions of files, hundreds of PB, single namespace, hundreds of sites idea that blows traditional file system access away. Why doesn't one exist? It would be horribly expensive and complex to set up and, The Reg suspects, unrealistic. After all, CERN chose tape to store its online Large Hadron Collider experimental data. Why not object storage? Too expensive, maybe?
If object storage isn't suitable for real-world use cases involving hundreds of petabytes, which look like ideal object storage use cases, then perhaps the industry is marketing the technology with the wrong message. ®
Comments to The Reg forum, please.