Perish the fault! Can your storage array take a bullet AND LIVE?

Sysadmin Trevor's gentle guide to protecting your data - and your career

Get this right and you'll be singing in the RAIN

RAIN stands for Redundant (or Reliable) Array of Inexpensive Nodes. For a brilliant explanation I direct you to this video by Gene Fay of Nine Technology. Short version: RAIN copies your data across multiple individual computers for redundancy.

[Image: a server rack full of storage nodes]
Seize the RAINs, keep your servers' data protected
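
To see the idea in miniature, here's a toy Python sketch of replica placement. Everything in it is invented for illustration - local directories stand in for what would, in a real RAIN deployment, be separate machines on a network:

```python
import hashlib
from pathlib import Path

# Toy stand-ins: in real RAIN these are separate machines, not directories.
NODES = [Path(f"node{i}") for i in range(4)]
REPLICAS = 2  # copies kept of every object

def node_ring(key: str):
    """Pick REPLICAS distinct nodes for a key by hashing it."""
    start = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICAS)]

def put(key: str, data: bytes) -> None:
    # Write every replica; a real system would do this over the network.
    for node in node_ring(key):
        node.mkdir(exist_ok=True)
        (node / key).write_bytes(data)

def get(key: str) -> bytes:
    # Any surviving replica will do - that's the whole point of RAIN.
    for node in node_ring(key):
        try:
            return (node / key).read_bytes()
        except OSError:  # node down, disk dead, bullet hole...
            continue
    raise IOError(f"all {REPLICAS} replicas of {key!r} are gone")

put("backup-001", b"precious data")
print(get("backup-001"))  # still works if one node directory is deleted
```

Lose one node and reads carry on from a survivor; lose every node holding a given key and that object is gone, which is why replica counts matter.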

There are many different implementations of RAIN out there today; this is a large part of what the kerfuffle over "big data" is all about. When you have conversations about HDFS, GlusterFS or Amazon's S3 you are talking about RAIN. In general, RAIN setups don't work like traditional file systems, although the Gluster team is building tech on top of GlusterFS that seeks to change this.

With most RAIN setups, your operating system doesn't mount them and you don't create NFS or SMB shares on them. If you really want to do those types of activities, you need to use virtual disks on top of the RAIN array via something like FUSE. At this point you're way out in the weeds and you should probably be reassessing the whole project. Still, if you really want to, you can be bizarre and run VMware virtual machines on Gluster via an NFS server translator.
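
For the morbidly curious, the translation shim looks roughly like this: a toy, read-only example using the third-party fusepy library, with an in-memory dictionary standing in for the actual RAIN back end. Don't mistake it for production code:

```python
# pip install fusepy -- exposes objects from a (pretend) RAIN store as files.
import errno
import stat
from fuse import FUSE, FuseOSError, Operations

# In reality these reads would go over the network to the object store.
OBJECTS = {"hello.txt": b"this blob really lives on a RAIN array\n"}

class ObjectFS(Operations):
    def getattr(self, path, fh=None):
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
        key = path.lstrip("/")
        if key not in OBJECTS:
            raise FuseOSError(errno.ENOENT)
        return {"st_mode": stat.S_IFREG | 0o444, "st_nlink": 1,
                "st_size": len(OBJECTS[key])}

    def readdir(self, path, fh):
        return [".", ".."] + list(OBJECTS)

    def read(self, path, size, offset, fh):
        return OBJECTS[path.lstrip("/")][offset:offset + size]

if __name__ == "__main__":
    # Mount point is arbitrary; create it first: mkdir /tmp/objfs
    FUSE(ObjectFS(), "/tmp/objfs", foreground=True)
```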

While you can throw layers of translation on top of a RAIN setup to make it pretend to be a traditional disk, RAIN is generally for object (not file) storage. It's better to think of RAIN setups as really big databases rather than traditional file systems.
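
If "really big database" sounds abstract, this caricature is about the right mental model. The keys and values below are mine; the moral is that you get whole objects in and out, and nothing else:

```python
# An object store, squinted at: one giant key-to-blob mapping.
store = {}

# PUT and GET operate on whole objects, addressed by key, not by path.
store["backups/web01/nightly.img"] = b"...the entire image in one go..."
blob = store["backups/web01/nightly.img"]

# What there ISN'T: open(), seek(), append(), in-place edits. To change
# one byte you fetch the whole object, alter it and PUT it back - hence
# the translation layers above if you insist on file semantics.
```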

Bulletproof clusters

Of course, if ZFS or RAID underpins your storage layer, what happens if I shoot the storage server? RAIN would seem to be resilient to the loss of an individual system, but there's nothing native to ZFS or RAID to deal with a bullet through the CPU.

This is where clustering comes in. An ideal deployment for fault tolerance would have two servers in bit-for-bit lock-step. In the free software world you are looking at DRBD with Linux or HAST with FreeBSD.
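
The essence of that lock-step (DRBD calls its fully synchronous mode "protocol C") is that no write completes until both machines have it. A hand-rolled Python sketch, with a made-up wire format and an invented peer address, just to show the ordering:

```python
import socket

PEER = ("10.0.0.2", 7789)  # invented address and port, for illustration only

def replicated_write(local_dev, offset: int, data: bytes) -> None:
    # 1. Commit the block locally.
    local_dev.seek(offset)
    local_dev.write(data)
    local_dev.flush()
    # 2. Ship the same block to the peer and WAIT for its acknowledgement.
    with socket.create_connection(PEER) as conn:
        conn.sendall(offset.to_bytes(8, "big") + data)
        if conn.recv(3) != b"ACK":
            raise IOError("peer never confirmed the write")
    # Only now does the caller see success. Shoot either box mid-write
    # and the survivor still holds a consistent copy of the data.
```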

Assuming you have a solid hardware RAID underneath, Microsoft's Server 2012 is actually the basis for a very reliable cluster. Cluster Shared Volumes v2 is how I get my RAID 61: hardware RAID 6 on each node, mirrored. (I turn write caching off in order to ensure that I don't lose data in memory if a node dies. Slower, but safer.)
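
The capacity arithmetic on that RAID 61 layout is worth doing before you buy. Example numbers only (mine, not a sizing guide): two nodes, each with eight 2TB disks in RAID 6, mirrored across nodes:

```python
disks_per_node, disk_tb = 8, 2
raid6_usable = (disks_per_node - 2) * disk_tb  # RAID 6 gives two disks to parity
usable = raid6_usable                          # the node-to-node mirror halves it
raw = 2 * disks_per_node * disk_tb
print(f"{usable} TB usable out of {raw} TB raw")  # 12 TB out of 32 TB
```

A little over a third of the raw capacity survives, which is the going rate for storage that can take a bullet.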

Combine that with Server 2012's new NFS 4.1 server, the iSCSI target or SMB 3.0 (which supports multichannel, transparent failover and node fault tolerance) and I can shoot one of my Microsoft servers without the VMware cluster that uses them for storage knowing anything's happened.

Speaking of VMware, they offer the vSphere Storage Appliance. It is a reliable technology for creating a storage cluster; however, it only scales to three physical systems per storage appliance.

It's all rather a mess right now, isn't it?

If you are starting to sense some holes in feature availability here, you aren't alone. This is why storage vendors exist as separate entities. Honest-to-$deity fault-tolerant storage with open-source tools is an absolute pig to implement, and Microsoft needs time to get all its technology ducks in a row. (It needs triple disk redundancy with ReFS on Cluster Shared Volumes scaling to hundreds of nodes before it is a real player.) VMware has the basic technology, but it needs to scale quite a bit more before it is a real consideration.

This is why there are so many storage startups out there. It is also why the storage giants can still sell those big, expensive SANs. There is a lot to consider when planning your storage today, even if it is only for a single server. What you knew ten years ago doesn't really apply any more. What you knew five years ago is probably just enough to get you into trouble.

Of course, these technologies are for fault tolerance only. Fault tolerance is not a backup. If your data doesn't exist in at least two physical locations, then your data does not exist; make sure that, on top of utilising the fault-tolerant technologies discussed above, you have a proper backup plan. And remember: a fault-tolerant system (or a backup) that hasn't been tested isn't any form of protection at all. ®
