Feeds

Understanding what's going on in storage arrays is like doing MAGIC

Software defined storage? It's a load of bollocks

Internet Security Threat Report 2014

Storagebod As the announcements and acquisitions which fall into the realms of Software Defined Storage – or "storage", as I like to call it – continue to come, one starts to ponder how this is all going to work in the real world.

I think it is extremely important to remember that you are going to need hardware to run this software on. While current trends show this moving towards a commodity model, there are going to be subtle differences that are going to need accounting for. And as we move down this track, there is going to be a real focus on understanding workloads and the impact of different infrastructure and infrastructure patterns on this.

I am seeing more and more products that enable DAS to work as a shared-storage resource, removing the SAN from the infrastructure and reducing the complexity. I am going to argue that this does not necessarily remove complexity but it shifts it. In fact, it doesn't remove the SAN at all – it just changes it.

It is not uncommon now to see storage vendor presentations that show Shared-Nothing-Cluster architectures in some form or another; often these are software and hardware "packaged" solutions. Yet as end-users start to demand the ability to deploy on their own hardware, this brings a whole new world of unknown behaviours into play.

Once vendors relinquish control of the underlying infrastructure, the software is going to have to become a lot more intelligent, and the end-user implementation teams are going to have to start thinking more like the hardware teams in vendors.

For example, the East-West traffic models in your data-centre become even more important. You might find yourself implementing low-latency storage networks; your new SAN is no longer a North-South model but Server-Server (East-West). This is something that the virtualisation guys have been dealing with for some time.

Then there's understanding performance and failure domains: do you protect the local DAS with RAID or move to a distributed RAIN (Redundant Array of Independent Nodes) model? If you do something like aggregate the storage on your compute farm into one big pool, what is the impact if one node in the compute farm starts to come under load? Can it impact the performance of the whole pool?

Anyone who has worked with any kind of distributed storage model will tell you that a slow performing node – or a failing node – can have impacts which far exceed what you'd have believed possible. At times, it can feel like the good old days of token ring, where a single misconfigured interface can kill the performance for everyone. Forget about the impact of a duplicate IP address – that's nothing compared to this.

So what is the impact of the failure of a single compute/storage node? What about if multiple compute/storage nodes go down?

In the past, this was all handled by the storage hardware vendor and, pretty much invisibly, at implementation phase by the local storage team. But you will need now to make decisions about how data is protected and understand the impact of replication.

In theory, you want your data as close to the processing as you can get it, but data has weight and persistence; it will have to move. Or do you come up with a method that allows a dynamic infrastructure that identifies where data is located and spins/moves the compute to it?

The vendors are going to have to improve their instrumentation as well. Let me tell you from experience that understanding what is going on in such environments is deep magic. Also the software's ability to cope with the differing capabilities and vagaries of a large-scale commodity infrastructure is going to be have to be a lot more robust than it is today.

Yet I see a lot of activity from vendors, open-source and closed-source, and I see a lot of interest from the large storage consumers. This all points towards a large prize to be won. But I'm expecting to see a lot of people fall by the road.

It's an interesting time to be in storage... ®

Beginner's guide to SSL certificates

More from The Register

next story
Docker's app containers are coming to Windows Server, says Microsoft
MS chases app deployment speeds already enjoyed by Linux devs
'Hmm, why CAN'T I run a water pipe through that rack of media servers?'
Leaving Las Vegas for Armenia kludging and Dubai dune bashing
'Urika': Cray unveils new 1,500-core big data crunching monster
6TB of DRAM, 38TB of SSD flash and 120TB of disk storage
Facebook slurps 'paste sites' for STOLEN passwords, sprinkles on hash and salt
Zuck's ad empire DOESN'T see details in plain text. Phew!
SDI wars: WTF is software defined infrastructure?
This time we play for ALL the marbles
Windows 10: Forget Cloudobile, put Security and Privacy First
But - dammit - It would be insane to say 'don't collect, because NSA'
Oracle hires former SAP exec for cloudy push
'We know Larry said cloud was gibberish, and insane, and idiotic, but...'
Symantec backs out of Backup Exec: Plans to can appliance in Jan
Will still provide support to existing customers
prev story

Whitepapers

Forging a new future with identity relationship management
Learn about ForgeRock's next generation IRM platform and how it is designed to empower CEOS's and enterprises to engage with consumers.
Why cloud backup?
Combining the latest advancements in disk-based backup with secure, integrated, cloud technologies offer organizations fast and assured recovery of their critical enterprise data.
Win a year’s supply of chocolate
There is no techie angle to this competition so we're not going to pretend there is, but everyone loves chocolate so who cares.
High Performance for All
While HPC is not new, it has traditionally been seen as a specialist area – is it now geared up to meet more mainstream requirements?
Intelligent flash storage arrays
Tegile Intelligent Storage Arrays with IntelliFlash helps IT boost storage utilization and effciency while delivering unmatched storage savings and performance.