Linux takes obscurity route to datacentre

SAN castles...

Yes, OK. We know you're busy people. But our passion for bringing obscure Linux file system announcements to your attention remains undiminished. Especially when they're important, and this one is certainly worth keeping tabs on.

Last week Sistina released version 5.0 of the open source GFS clustered file system. It richly deserves its obscurity rating - any FAQ that begins with the question "Is STOMITH absolutely required?" can be judged to have a small but intense following.

But Matt O'Keefe's GFS long predates Sistina, the company he founded to commercialize the project he began at the University of Minnesota. GFS not only has the potential to unseat the leading vendor in its class - Veritas - but to bring open source commoditisation to servers and storage.

Hardware storage vendors insist that the expensive magic belongs in the box. And up to a point they're right - EMC's high end storage is based on fabulously well-tuned, specialist servers. But there's a school of thought that suggests that given the right software, there's nothing that a room full of white box PCs can't do almost as well. This school maintains that storage is essentially a software problem, the problem being massaging a load of PCs to act and behave as a single SAN. EMC can justifiably contest this argument, of course, but that hasn't stopped people from trying to find a commodity software answer to SANs. And so clustered file systems have long been touted as the key that opens the door.

The debate has been kept alive in the Linux community, which sees clustered file systems as paving the way for The Final Victory. First the invisible infrastructure turned to open source - more BSD than Linux here, admittedly - in BIND and routing services. Then the edge infrastructure followed, running the web servers. That invites Linux in as a development platform, which tempts real grown-up application vendors like Oracle to suggest Linux is a viable platform. To succeed there, and in storage, however, it needs to behave as respectably as Solaris, and that's been singularly lacking. And Network Appliance has scooped up much of this business very neatly: it's Linux friendly but ensures the file system protocols favor NetApp kit. So it can claim the best of both worlds.

Basically, Sistina tries to do what Veritas does - provide a clustered file system and lock manager for applications - only cheaper, and on Linux. Recently GFS has attracted more attention, for all the wrong reasons. At LinuxWorld in August, Sistina announced that GFS would no longer be available under the GPL. But it remains the most used Linux clustered file system, despite several academic rival projects (Andrew and Coda) and recent competition from Compaq's decision to open source its Non Stop Cluster work.

GFS Version 5.0 adds some SSI (single system image) functionality and the ability to split mirrored backups to the really high end storage from EMC and Hitachi (which is licensed by Sun and HP). How extensive this is compared to Veritas or NSC we're not in a position to judge, but we'd welcome experiences from the field. What we would like to know is how well a Linux-centric clustered file system matches up in cost and performance to the proprietary alternatives, and whether a non-free file system is judged to be trustworthy, or just a low-budget Veritas. Then we'll have a clearer idea of how far Linux is from storming the datacentre. ®

