Original URL: http://www.theregister.co.uk/2009/07/24/file_storage_startup_myth/

Storage start-ups fail to set the world on fire

How IT fell for file storage growth myths

By Chris Mellor

Posted in Storage, 24th July 2009 12:31 GMT

Comment Try this point of view on for size: there is no general large-scale file storage problem. Companies set up to deal with that problem have failed to set the world on fire and over-invested ones, like ONStor and Copan, are facing difficulties.

Meanwhile, block storage SAN re-invention companies, such as 3PAR and Compellent, have done better, showing up the lack of customer need for a panacea for the problem of having too many files and too many large files. The panacea isn't needed because the general problem doesn't exist.

It was not supposed to be like this. Several years ago, engineers, marketeers and entrepreneurs could see a file storage problem looming. The media industry's move from analogue to digital storage was going to create millions, or even billions, of image files, music files and movie files. E-mail use was spreading like a pandemic, with overflowing mailboxes and millions of attachments, many duplicated. Collaborative software like Lotus Domino and SharePoint was causing millions more files to be created.

There was a general and continual rise in the use of unstructured information that needed to be kept, just in case it was needed. It was persistent or reference information and it was held in filer silos, hundreds of them sometimes, located across enterprises, with no co-ordination and no consistent way to search for content. The compliance and eDiscovery dynamics were, and are, often used to strengthen the supposed customer need for these products.

Storage was split into direct-attached storage (DAS) for blocks and files, network-attached storage (NAS) for files, and storage area networks (SANs) for blocks. SANs were beginning to virtualise the physical storage but there was nothing like that for file storage, NAS being far less consolidated than SANs.

The entrepreneurs, developers and engineers looked at this and saw OPPORTUNITY written large. They started up projects inside storage companies, and even started up new storage companies, to create the next killer storage product. The one that would kick the file storage problem into touch.

Their responses to the problem were different, but hindsight says they all made the mistake of assuming that the problem was larger than it actually turned out to be.

Four file storage problem groups

First was spin-down. Copan and Nexsan and others thought the way to make file storage less onerous was to spin down disk drives and work with a pair of supporting dynamics. One was coming power shortages in metropolitan areas, combined with environmental carbon emission-cutting thinking. The other was data centre space limitations. If you cut electricity use by spinning down disk drives and pack the drives better, then you reduce power draw and space needs simultaneously. You can store more files in less data centre floorspace and need less power. It was a triple whammy win that could not fail.

Secondly, there was file virtualisation. You interpose a special server box between application servers and the multifarious file stores and have all the files in all your file stores represented in this one box inside a global namespace. You virtualise the file stores, so it looks like there is just one file storage universe which app servers can tap into. Acopia and Rainfinity and FilesX tried this route to bring sense to the file storage horror story.
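
To make the global namespace idea concrete, here is a minimal sketch of the lookup such a box would have to perform: take a path in the single virtual tree and work out which back-end filer really holds the file. The class name, filer names and export paths are all made up for illustration, and the longest-prefix matching is an assumption; this is not how any of the vendors named above actually built their products.

```python
from pathlib import PurePosixPath


class GlobalNamespace:
    """Toy global namespace: one virtual tree mapped onto several filers."""

    def __init__(self):
        # virtual mount point -> (hypothetical filer name, export path on it)
        self._mounts = {}

    def mount(self, virtual_prefix, filer, export):
        self._mounts[PurePosixPath(virtual_prefix)] = (filer, PurePosixPath(export))

    def resolve(self, virtual_path):
        """Return (filer, physical path) for a path in the virtual tree."""
        path = PurePosixPath(virtual_path)
        # Assumption: the longest matching prefix wins, so nested mounts work.
        for prefix in sorted(self._mounts, key=lambda p: len(p.parts), reverse=True):
            try:
                relative = path.relative_to(prefix)
            except ValueError:
                continue  # this mount does not cover the requested path
            filer, export = self._mounts[prefix]
            return filer, str(export / relative)
        raise FileNotFoundError(virtual_path)


ns = GlobalNamespace()
ns.mount("/corp/finance", "filer-a", "/vol/vol1/finance")
ns.mount("/corp/media", "filer-b", "/export/media")
print(ns.resolve("/corp/media/ads/spot.mov"))
```

App servers only ever see the /corp tree; which filer sits behind each branch can change without them noticing, which was the selling point.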

Thirdly, general archival storage boxes sprang up, with EMC's Centera being the obvious one. Others are also plugging away at this space: Caringo, Mimosa, and Waterford. Plasmon tried - and died - here too. Lots of people saw that Centera was sky high in price and as proprietary as you like, and came up with commodity hardware/open software alternatives. None of them toppled Centera from its throne because they weren't good enough, and there wasn't a sufficiently general problem to prompt widespread adoption of their products.

Instead, specific archival storage products - ones focussed on e-mail or SharePoint - have survived and are developing into general archival products, with good compliance and e-Discovery functions. There is a developing market for these products, but it's not as large or as widespread as early product developers hoped.

Fourthly, we saw the development of scale-out filers, often using some form of clustering, to solve the problem of serving very large numbers of files, often large files, to a set of servers simultaneously. Large files would be split into sub-files spread across several filers, with the parts served in parallel. Ibrix developed software for this. BlueArc developed FPGA hardware-accelerated super-NAS products. Isilon, Exanet, and ONStor developed clustered filer hardware and software. Again, there is a real problem here but customer interest turned out to be concentrated in two areas rather than being general.
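
The splitting of a large file across filers can be sketched in a few lines. This is an illustration only: the function name and filer names are hypothetical, and simple round-robin placement is an assumption, since nothing here says how any particular vendor laid data out.

```python
def stripe_layout(file_size, stripe_size, filers):
    """Return (filer, offset, length) chunks for a file, placed round-robin."""
    if stripe_size <= 0 or not filers:
        raise ValueError("need a positive stripe size and at least one filer")
    layout = []
    offset = 0
    index = 0
    while offset < file_size:
        # The last chunk may be shorter than a full stripe.
        length = min(stripe_size, file_size - offset)
        layout.append((filers[index % len(filers)], offset, length))
        offset += length
        index += 1
    return layout


# A 10-byte file in 4-byte stripes over two hypothetical filers.
for chunk in stripe_layout(10, 4, ["filer-a", "filer-b"]):
    print(chunk)
```

Each filer can then serve its own chunks at the same time, which is where the extra bandwidth for rendering and HPC workloads comes from.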

Digital movie effects meant that rendering scenes needed massive file delivery horsepower and that benefitted Isilon, Ibrix and BlueArc. It also benefitted some block storage suppliers, like DataDirect Networks, but we'll ignore those here because this is a file-focussed story.

High-performance computing (HPC) and supercomputing also needed the same sort of massive filer bandwidth to cope with seismic, simulation and genome-type data. However, general business did not.

ESG's Steve Duplessie points out that Web 2.0 companies like Amazon, Google and Yahoo also had an internal need for scale-out filers, and sometimes built their own infrastructure for this in a massively impressive way. It didn't generally benefit our scale-out NAS startups, though, and was specific to these massive-scale Internet-based service suppliers, not to everyday business.

Not a general problem

What these four groups of startups didn't realise was that the generality of businesses would be content - not happy maybe, but content - to carry on as before. NAS boxes grew to hold more data and it was easier to carry on with existing file storage processes than to bring in new and untried ideas like Copan's MAID, Isilon's clusters, Acopia's file virtualisation or whatever.

They also over-stated the influence of compliance and e-Discovery. The sell here was basically one based on scaremongering: look how much customer X got fined because they couldn't find a file in time! Not enough people were frightened into buying these start-ups' products, and the existing filer vendors did enough to assuage the compliance and e-Discovery concerns of their customer base.

The quartet of file storage problem-solving startups then met other problems. Sub-file level deduplication removed a prime reason for having their products, especially when the dedupe products worked within the existing backup software and process infrastructure: witness the success of Data Domain. Specialised dedupe vendors built and developed products, and grew their companies faster than the group of four file storage problem solvers we are looking at.

In the last couple of years, EMC and IDC have done a fine job of alerting everyone to the storage consequences of the digitisation of media, of social interactions and increasingly of intelligent device communications. The file storage problem pioneers are being proved right, but it is not doing them much good.

They came into existence on the back of an exaggerated problem. Because of this, they weren't generally able to capitalise on the limited demand for their products and build sustainable businesses before the major storage vendors did enough to catch up and take their market prospects away. That old consultant's mantra of get big, get niche, or get out (or bought) applies here.

None of them have got big. File virtualisation is a dud. Persistent storage is a nice feature to have with an archive, but of limited appeal on its own. Clustered scale-out filing is a niche, with major suppliers ready to capitalise on it if it becomes mainstream. Archive is a niche with cloud storage positioned as a probably valid alternative storage choice to whatever drive arrays you choose.

This piece started with the statement that there is no general large-scale file storage problem. I'd contend that there still isn't. The problem area is bigger than before but there's no sign yet that business in general needs a massively scalable, logically single file store, with a global namespace, compliance and e-Discovery facilities based on the greenest possible drive arrays.

Some businesses need some of this. Not enough businesses need all of it, and that's why not one of the file storage problem start-up companies mentioned here has become a successful and major supplier in its own right. ®