GFS and open source clustering
Dumb-down Reg makes storage too simple
Many readers found our bird's-eye view of the storage industry too simple to be useful. Here are a couple of constructive comments.
We should also point out that our article left fuzzy the question of whether Sistina's GFS is open source.
After a chat with Matt O'Keefe, founder of Sistina, it's clear that it is not. O'Keefe says the closed-source model Sistina adopted in August was the result of OEM and customer demand as much as the need to make a dime. It merits a full follow-up in short order, but here are your impressions:
"Basically, Sistina tries to do what Veritas does - provide a clustered file system and lock manager for applications - only cheaper, and on Linux. More recently, GFS has got more attention for all the
wrong reasons. At LinuxWorld in August, Sistina announced that it would no longer be available under the GPL. But it remains the most used Linux clustered file system, despite several academic rival projects (Andrew
and Coda) and recent competition from Compaq's decision to open source its Non Stop Cluster work."
Andrew and Coda are what might be called "network" file systems (although that name, without the quotes around "network", refers, of course, to a *particular* "network" file system) - data is stored on a
server, and the server takes responsibility for the data. The same is true of NFS, SMB/CIFS, and the OSF's DFS. Andrew and Coda *do* have the notion of files being "cached" on the client (so do NFS and CIFS - or,
rather, they have the notion of individual blocks of a file being cached on the client, with CIFS and NFSv4, unlike NFSv2 and v3, having mechanisms to tell a client that some other client is using the file and that the client should no longer keep the data to itself).
GFS, as I understand it, does all the file system work on the clients, with the "servers" acting as dumb disks.
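To make the caching point concrete, here is a minimal Python sketch of a server that owns the data and tells clients when to drop a cached copy, roughly the kind of mechanism described above for CIFS and NFSv4. Every name in it (`Server`, `Client`, `invalidate`) is hypothetical and chosen for illustration; it is a toy model under those assumptions, not how any of these file systems is actually implemented.

```python
class Server:
    """Toy file server: it owns the data and remembers who caches each file."""

    def __init__(self):
        self.files = {}    # file name -> contents
        self.cachers = {}  # file name -> set of clients holding a cached copy

    def read(self, client, name):
        # Remember that this client now holds a cached copy.
        self.cachers.setdefault(name, set()).add(client)
        return self.files.get(name, b"")

    def write(self, client, name, data):
        self.files[name] = data
        # Tell every other caching client to stop trusting its copy.
        for other in list(self.cachers.get(name, set())):
            if other is not client:
                other.invalidate(name)
        self.cachers[name] = {client}


class Client:
    """Toy client: caches whole files until the server revokes them."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.read(self, name)
        return self.cache[name]

    def write(self, name, data):
        self.server.write(self, name, data)
        self.cache[name] = data

    def invalidate(self, name):
        # Called by the server when another client changes the file.
        self.cache.pop(name, None)
```

In this model the server stays the single authority on the data. A design where the clients do all the file system work against shared "dumb disks" has no such central party, so the clients have to coordinate among themselves, which is where a lock manager comes in.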
And Network Appliance has scooped up much of this business very neatly: it's Linux friendly but ensures the file system protocols favor NetApp kit. So it can claim the best of both worlds.
We're "Linux-friendly" in the same sense that we're "Solaris-friendly" and "HP-UX-friendly" and "AIX-friendly" and "BSD-friendly" and so on -
we act as an NFS server, and all the various flavors of UNIX (including reimplementations such as Linux) can act as NFS clients. (We're also "Windows-friendly" in that we also act as a SMB/CIFS server; of course,
Linux has had its own SMB/CIFS client file system for a while, and FreeBSD also got one recently, as did MacOS X, so if you want to use a "network" file system designed for Windows on your UNIX box, for some
reason, you can do that.)
I'm not sure what "ensures the file system protocols favor NetApp kit" means here; we didn't invent the file system protocols we support (NFS and SMB/CIFS), although we have participated in the groups designing
NFSv3 and NFSv4 and in the SNIA group producing a CIFS standard. We think our *implementations* of those protocols - and our underlying file system - have some advantages over other implementations and file
systems, but that's another matter, and we *do* offer other services with protocols of our own (some of which, e.g. NDMP, are published and available for others to implement).
Also, note that GFS can use SAN storage devices as "dumb disks", so it's not as if GFS necessarily displaces large RAID boxes on a SAN.
(Also, although GFS might no longer be available under the GPL, there's a project to continue to develop the GPLed version, OpenGFS.)
A Storage Area Network is composed of several levels, from the disk level to the Fibre Channel fabric to the operating systems supported, and _then_ you have storage virtualization software such as Veritas or GFS. (Note that Compaq has a SAN / storage virtualization offering that is totally server independent, e.g. http://www.compaq.com/products/storageworks/enterprise/index.html.)
I guess I am just nit-picking here, but your article made storage area networks sound like an OS issue. In my opinion storage virtualization is (or should be) completely independent of the servers which use the storage.
It is nice that there is a full-fledged storage virtualization package available for Linux, but your article didn't illuminate why or where it should be used (other than its availability for an open source OS).
It seems to me that you are lumping NAS, clustered file systems, and SANs together and trying to compare virtualization offerings at different levels. In my view, a very apples-to-oranges attempt.
As for "the ability to split mirrored backups to ... storage": with Compaq SAN offerings you can "clone" or "snapshot" existing raidsets (or volumes, if you prefer) by issuing commands directly to the storage controller via a management appliance, with no OS intervention.
I guess I would have been happier with your article if it had clearly differentiated between the SAN (as storage) and the virtualization software providing specific functionality.
[name and address supplied]®