Storage is boring, right?
Understanding the Sherpa layer of IT
Some ‘information assets’ are more equal than others
Responding to these needs requires more than buying in a bunch of disks. For a start, storage arrays tend to fall into one of two categories – “high-end” for very expensive, high-performance disks, and “mid-range” for the rest (you don’t hear of “low-end” storage arrays). More recently, a third category of disk has emerged, namely the solid state disk (SSD). While SSDs might suggest the beginning of the end for spinning disks, they currently remain more expensive per gigabyte than hard disks – though their performance characteristics exceed even the fastest high-end drives.
Given these options, deciding what data should go where is quite a skill, particularly as data characteristics change over time. A well-managed tiered storage set-up would match the ratio of high- and lower-cost storage in use to the requirements of the data at any given time.
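To make the idea concrete, here is a minimal sketch of a tiering policy that maps data to a storage class based on how often it is accessed. The thresholds and tier names are illustrative assumptions, not figures from any vendor:

```python
# Illustrative only: thresholds are assumptions, and a real tiering
# engine would also weigh capacity, cost and latency requirements.
def choose_tier(accesses_per_day: float) -> str:
    """Pick a storage tier for a dataset by access frequency."""
    if accesses_per_day > 100:
        return "ssd"        # hot data: performance justifies the cost
    elif accesses_per_day > 1:
        return "high-end"   # warm data: fast spinning disk
    else:
        return "mid-range"  # cold data: cheaper capacity storage

print(choose_tier(500))   # frequently hit database index
print(choose_tier(0.1))   # rarely touched archive data
```

In practice the policy would be re-evaluated periodically, since – as noted above – data characteristics change over time, and yesterday’s hot data becomes today’s archive candidate.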
An additional dimension is ensuring appropriate data protection and availability. This is generally achieved by placing a copy of the information in some other, preferably safe, place – either online (for example, via a second array to which all data is replicated) or offline (for example on tape, in a fire-proof safe or at an off-site storage facility).
Deciding between all the options and coming up with an appropriately architected storage environment is not trivial. Neither are things standing still – not only are storage technologies evolving, but so too are other areas of IT, upon which storage depends. It’s worth homing in on a few developments, to illustrate the point.
Not least of course, we have virtualisation. Server virtualisation may be the buzz-phrase right now, but its growth places new demands on how storage is built and delivered; the ease with which new servers and whole systems can be provisioned increases the risk of storage bottlenecks, as does the accompanying fluctuation in demand.
To counter this, we have of course storage virtualisation – which enables storage resources to be treated as a single pool, and then provisioned as appropriate. The phrase in vogue at the moment is ‘thin provisioning’, in which a server or application may think it has been allocated a certain disk volume, but in fact the storage array only allocates the physical storage required up to the specified maximum (which may never be reached). This makes for a far more efficient use of storage.
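The mechanism can be sketched in a few lines. This is a toy model, not any array’s real behaviour – the class and its names are assumptions – but it shows the essential trick: the reported (logical) size and the physically allocated space are tracked separately, and physical blocks are only claimed on first write:

```python
# A toy model of thin provisioning. The interface is invented for
# illustration; real arrays do this in firmware, not Python.
class ThinVolume:
    BLOCK_SIZE = 4096  # bytes per physical block (assumed)

    def __init__(self, logical_size: int):
        self.logical_size = logical_size  # what the server believes it has
        self.blocks = {}                  # physical blocks, allocated lazily

    def write(self, offset: int, data: bytes) -> None:
        if offset + len(data) > self.logical_size:
            raise ValueError("write beyond provisioned size")
        for i, byte in enumerate(data):
            block, pos = divmod(offset + i, self.BLOCK_SIZE)
            # Allocate the physical block only when first written to
            buf = self.blocks.setdefault(block, bytearray(self.BLOCK_SIZE))
            buf[pos] = byte

    def physical_usage(self) -> int:
        return len(self.blocks) * self.BLOCK_SIZE

vol = ThinVolume(logical_size=1 << 30)  # server sees a full 1 GiB volume
vol.write(0, b"hello")                  # but only one 4 KiB block is consumed
print(vol.logical_size, vol.physical_usage())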
Speaking of efficiency, another trendy term is de-duplication, in which only variations in files or disk blocks are retained, transferred, backed up or whatever, rather than – ahem – duplicating everything. For the uninitiated, this one does sound a bit of a no-brainer – but the fact is that de-duplication can get quite complicated. An index needs to be maintained of everything that is being stored or backed up, so that files can be ‘reconstructed’ as necessary, for example.
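A stripped-down sketch shows both halves of the idea – the single-copy store and the index needed to reconstruct files from it. The class and block size are assumptions for illustration; production systems add collision handling, reference counting and garbage collection on top:

```python
# A minimal sketch of block-level de-duplication, illustrative only.
import hashlib

class DedupStore:
    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.index = {}  # block hash -> block data: one copy per unique block
        self.files = {}  # filename -> ordered list of block hashes

    def store(self, name: str, data: bytes) -> None:
        hashes = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            h = hashlib.sha256(block).hexdigest()
            self.index.setdefault(h, block)  # skip blocks already stored
            hashes.append(h)
        self.files[name] = hashes

    def reconstruct(self, name: str) -> bytes:
        # The per-file hash list is the 'index' the article mentions:
        # without it, the original file cannot be reassembled.
        return b"".join(self.index[h] for h in self.files[name])

store = DedupStore()
store.store("a.txt", b"x" * 8192)
store.store("b.txt", b"x" * 8192)  # identical content: no new blocks stored
```

Storing the second, identical file adds nothing to the block store – only a second entry in the file index – which is where the bandwidth and capacity savings come from.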
Other storage-related developments worthy of note are on the networking side with iSCSI (which carries block storage – typically for databases – over the same IP networks used for file storage and unstructured content), and at the higher end, the merging of data and storage networking using 10 Gigabit Ethernet. Meanwhile, in storage software, ‘merger talks’ continue between backup and archiving (they’re both about data movement, after all).
Taking an architectural view of storage
All such developments share a common theme, that of convergence: with a following wind and from an interoperability perspective at least, things should get a bit simpler. But making the most of them still requires an architectural overview of the storage environment as a whole.
If you haven’t got one, where should you start? As we have already acknowledged, few organisations have the luxury of starting from scratch. But if you can build a clear picture of your storage requirements (remembering not all information is created equal – the 80/20 rule can be used here), together with a map of what capabilities you already have in place, you’re already halfway there.
From here, the next step is to produce a map of how you’d like your storage capabilities to look, based on your information needs and your broader infrastructure plans – for example, vis-à-vis any intentions you might have around virtualisation. Given the plethora of options, each of which comes at a cost, storage planning will always be a compromise between functionality and affordability.
As a result it is worth thinking about how storage technologies might be used in tandem. For example, while it still isn’t cheap (though it’s getting cheaper), implementing de-duplication might provide immediate savings in bandwidth. Considered a little more broadly, however, those bandwidth savings may allow (for example) data replication to another site, making disaster recovery possible where in the past it was not.
Storage, then, is like a team of Sherpas; it does the heavy lifting so the rest of IT can make its way up the mountain without having to worry about the provisions. But it needs to be designed for the long haul if it is to deliver on the value it promises. This may be difficult to do. But it is anything but boring.