CERN turns to Seagate’s Kinetic system and says ‘it’s storage time’
Boffins may need to expend energy on software issues
Comment CERN, with its extremely high-tech, bleeding-edge Big Data wizardry, is waved around like a trophy by IT suppliers these days. Now Seagate has stepped up onto the CERN stage, wanting to get its Kinetic disk drives used to store Large Hadron Collider (LHC) data.
Seagate has gone and signed a three-year deal with CERN to scoop some of that LHC glamour and "to collaborate on the development of the Seagate Kinetic Open Storage platform".
The 4TB, 4-platter Kinetic drives store objects and are directly addressed using Ethernet, meaning interfacing software has to know about them. You can't just slot them in drive arrays and expect them to work.
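To make that difference concrete, here is a minimal sketch contrasting the two addressing models. Both classes are hypothetical in-memory stand-ins written for illustration. Neither is Seagate's actual Kinetic API, and the class names, method names and IP address are all assumptions.

```python
# Hypothetical in-memory stand-ins; neither class is Seagate's real API.


class BlockDrive:
    """A conventional drive: fixed-size blocks addressed by block number."""

    BLOCK_SIZE = 512

    def __init__(self, num_blocks):
        self._blocks = [bytes(self.BLOCK_SIZE)] * num_blocks

    def write_block(self, lba, data):
        if len(data) != self.BLOCK_SIZE:
            raise ValueError("a block write must be exactly one block long")
        self._blocks[lba] = data

    def read_block(self, lba):
        return self._blocks[lba]


class KeyValueDrive:
    """An object-style drive: variable-size values addressed by key."""

    def __init__(self, address):
        self.address = address  # assumed: each drive has its own network address
        self._objects = {}

    def put(self, key, value):
        self._objects[key] = value

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]


# Block access: the caller tracks which block numbers hold which data.
block = BlockDrive(num_blocks=8)
block.write_block(3, b"x" * BlockDrive.BLOCK_SIZE)

# Object access: the caller names the data and the drive finds room for it.
kv = KeyValueDrive("192.0.2.10")
kv.put(b"run-0001/event-42", b"collision payload")
```

Software written against the first interface has no way to talk to the second, which is why a key/value drive cannot simply be slotted into an existing array.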
The benefit is for large data stores which can get rid of storage array controllers and complex storage I/O stacks, and so simplify and speed disk I/O operations.
However, software has to be written to use them, and there's the rub. It takes time to craft reliable, enterprise-class storage software, particularly application software that accesses object storage. So far support has been thin, with only indications of support from AOL, Digital Sense and HP in November last year.
HP hasn't announced any Kinetic drive-based products or services, so far as we know.
EVault is supporting the drives, but then it is a Seagate subsidiary. Scality may be developing support as well. All in all, it's not exactly a flood of customers supporting the drives.
The situation isn't helped by the drives being proprietary and effectively single source, meaning any users are locked into Seagate as a supplier. Seagate has open-sourced the object access API but, since you can only get the drives themselves from Seagate, this doesn't void the lock-in situation.
CERN has produced 100PB of LHC data and more is being created at a 2PB-3PB per month rate, all "in its quest to further humanity’s understanding of the universe". How noble, and what a good piece of spin as background for a disk deal.
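For a sense of scale, a back-of-the-envelope sum shows what those figures would mean in 4TB drives. This is only a sketch under loud assumptions: decimal units, a single copy of the data with no replication or parity overhead, and all of the data sitting on Kinetic drives, which nobody has said is the plan.

```python
PB = 10**15  # decimal petabyte, as drive vendors count (assumption)
TB = 10**12  # decimal terabyte

DRIVE_CAPACITY = 4 * TB  # the 4TB Kinetic drive


def drives_needed(data_bytes):
    """Whole drives needed to hold one copy of the data, rounded up."""
    return -(-data_bytes // DRIVE_CAPACITY)


existing = drives_needed(100 * PB)    # the 100PB already produced
monthly_low = drives_needed(2 * PB)   # growth at 2PB a month
monthly_high = drives_needed(3 * PB)  # growth at 3PB a month

print(existing, monthly_low, monthly_high)  # 25000 500 750
```

Any real deployment would need more than this once redundancy is added.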
A CERN quote talked about trying to reduce complexity and operational costs in its storage systems. Doing that with Kinetic drives means someone, or some team, has to produce the object-storage application software needed.
The CERN openlab "provides companies with a framework to test and validate cutting-edge information technologies and services in partnership with CERN".
We're told a second, future research project between Seagate and CERN will look at CERN’s EOS storage system to determine whether there are opportunities to enhance and improve the system. CERN says: "EOS is a disk-based service providing a low-latency storage infrastructure for physics users. EOS provides a highly scalable hierarchical namespace implementation. Data access is provided by the XROOT protocol."
Were EOS to use Kinetic drives, it would need a Kinetic drive I/O layer interposed between it and the drive hardware, with the user-access code layer staying unchanged.
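What such an interposed layer might look like can be sketched in a few lines. This is an illustration only, not how EOS or Kinetic actually work: `InMemoryDrive`, `KeyValueBackend`, `user_roundtrip`, the chunking scheme and the key format are all hypothetical names and choices made up for the example.

```python
import zlib


class InMemoryDrive:
    """Hypothetical stand-in for one key/value drive on the network."""

    def __init__(self):
        self.objects = {}

    def put(self, key, value):
        self.objects[key] = value

    def get(self, key):
        return self.objects[key]


class KeyValueBackend:
    """Hypothetical I/O layer: turns file reads and writes into key/value calls."""

    def __init__(self, drives, chunk_size=1024 * 1024):
        self.drives = drives
        self.chunk_size = chunk_size

    def _drive_for(self, key):
        # Pick a drive by hashing the key; a real system would also replicate.
        return self.drives[zlib.crc32(key) % len(self.drives)]

    def write_file(self, path, data):
        # Split the file into chunks and store each one as its own object.
        chunks = [data[i:i + self.chunk_size]
                  for i in range(0, len(data), self.chunk_size)]
        for index, chunk in enumerate(chunks):
            key = f"{path}#{index}".encode()
            self._drive_for(key).put(key, chunk)
        # Record the chunk count so the file can be reassembled later.
        count_key = f"{path}#count".encode()
        self._drive_for(count_key).put(count_key, str(len(chunks)).encode())

    def read_file(self, path):
        count_key = f"{path}#count".encode()
        count = int(self._drive_for(count_key).get(count_key).decode())
        keys = [f"{path}#{index}".encode() for index in range(count)]
        return b"".join(self._drive_for(key).get(key) for key in keys)


def user_roundtrip(backend, path, data):
    """Upper-layer code: it only knows about write_file and read_file."""
    backend.write_file(path, data)
    return backend.read_file(path)


drives = [InMemoryDrive() for _ in range(3)]
backend = KeyValueBackend(drives, chunk_size=4)
result = user_roundtrip(backend, "/lhc/run1.dat", b"hello kinetic")
```

The point of the design is that `user_roundtrip` never sees a key or a drive. All the Kinetic-specific work lives in the backend, and that is the piece someone would have to write, test and maintain.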
The beauty of a storage array or VSAN or VSA is that familiar storage access protocols and stacks are used, so that app software doesn't have to change.
A drive array is a drive array is a drive array. You can stick in different disk drive types quite happily as long as they look like a standard drive. So HGST's Helium drives can be used in drive arrays with no change to upper-level software.
Not so with Kinetic drives: the whole storage I/O stack has to be rewritten and it looks like each customer has to craft its own system software to do this, what with there being no standard Kinetic disk drive array controller software block.
On this understanding, making Kinetic drives a success will be a hard, multi-year undertaking for Seagate, with no guarantee of a payoff. ®