Big Data is getting too damn big - and nobody is helping to fix this

See that nettle? Time to pop your gardening gloves on, chaps

Storagebod As vendors race to be better and faster, and to differentiate themselves in an already busy marketplace, the real needs of storage teams, and of storage consumers, can be left unmet. At times it is as if the various vendors are building dragsters, calling them family saloons and hoping that nobody notices. The problems that I blogged about when I started out still seem mostly unsolved.

Management

Storage management at scale is still problematic; it is still extremely hard to find a toolset that lets a busy team assess health, performance, supportability and capacity at a glance. Too many teams are still using spreadsheets and manually maintained records to manage their storage.

Tools which allow end-to-end management of an infrastructure from rust to silicon and all parts in between either still don’t exist or, where they do, come with large price tags and invariably without a real ROI or a realistic implementation strategy.

As we build more silos in the storage infrastructure, getting a view of the whole estate is harder now than ever. Multi-vendor management tools generally lack capability, with many vendors using subtle changes to hobble competing management tools.

Mobility

Data mobility across tiers, where those tiers are spread across multiple vendors, is hard; applications are generally not architected to handle it, and it rarely features in their non-functional specifications. And many vendors don’t want you to be able to move data between their devices and their competitors' - for obvious reasons.

But surely the most blinkered flash start-up must realise that this needs to be addressed; it is going to be an unusual company which will put all of its data onto flash.

Of course, this is not just a problem for the start-ups, but it could be a major barrier to adoption and is one of the hardest hurdles to overcome.

Scaling

Although we have scale-out and scale-up solutions, scaling is still a problem. Yes, we can scale to what appears to be almost limitless size these days, but the process of scaling brings problems of its own. Adding capacity is relatively simple; rebalancing performance to use that capacity effectively is not so easy. If you don’t rebalance, you risk hotspots on the old capacity and under-utilisation of the new.

Rebalancing requires careful planning and timing even with tools; it means understanding the underlying performance characteristics and requirements of your applications. And with some of the newer architectures that are storing metadata and de-duping, this appears to be a challenge to vendors. Ask vendors why they are limited to a certain number of nodes; there will be sheepish shuffling of feet, and alternative methods of federating a number of arrays into one logical entity will quickly come into play.

And then mobility between arrays becomes an issue to be addressed.
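
To make the rebalancing point concrete, here is a minimal sketch in Python. The node sizes and fill levels are invented for illustration and do not come from any real array or vendor tool; it only shows how an expansion leaves the old nodes hot and the new one idle until data is moved, and how much data that move involves.

```python
# Hypothetical figures: three full-ish nodes plus one newly added empty node.

def utilisation(used_tb, capacity_tb):
    """Return each node's utilisation as a fraction of its capacity."""
    return [u / c for u, c in zip(used_tb, capacity_tb)]


def rebalance(used_tb, capacity_tb):
    """Spread the total data so every node ends up equally utilised."""
    total_used = sum(used_tb)
    total_capacity = sum(capacity_tb)
    return [total_used * c / total_capacity for c in capacity_tb]


capacity = [100, 100, 100, 100]   # TB per node (assumed)
used = [80, 80, 80, 0]            # TB stored per node straight after expansion

target = rebalance(used, capacity)
before = utilisation(used, capacity)
after = utilisation(target, capacity)

# Data that has to leave the over-full nodes to reach the balanced layout
moved_tb = sum(max(0, old - new) for old, new in zip(used, target))

print("utilisation before:", before)
print("utilisation after: ", after)
print("data to move (TB): ", moved_tb)
```

With these assumed numbers, evening out the estate means shifting 60 TB across the back end while production workloads are still running, which is why the timing matters as much as the tooling.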

Deterministic Performance

As arrays get larger, more workloads get consolidated onto a single array - and without the ability to isolate workloads or guarantee performance, the risk of bad and noisy neighbours increases. Few vendors have yet grasped the nettle of QoS and still fewer developers actually understand what their performance characteristics and requirements are.
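
As a rough illustration of what workload isolation can mean, the sketch below uses a token bucket, one common rate-limiting technique, to cap each workload's IOPS separately. It is not a description of any vendor's QoS implementation; the class name, the limits and the request rates are all assumptions made for the example.

```python
class TokenBucket:
    """Per-workload IOPS limiter (hypothetical sketch, not a vendor feature)."""

    def __init__(self, rate_iops, burst):
        self.rate = rate_iops    # tokens (I/Os) added per second
        self.capacity = burst    # maximum tokens that can accumulate
        self.tokens = burst      # start with a full bucket
        self.last = 0.0          # time of the last refill, in seconds

    def allow(self, now):
        """Return True if one I/O may proceed at time `now` (seconds)."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def admitted(bucket, requests_per_second):
    """Count how many evenly spaced requests get through in one second."""
    count = 0
    for i in range(requests_per_second):
        if bucket.allow(i / requests_per_second):
            count += 1
    return count


# Each workload gets its own bucket, so one cannot drain the other's allowance.
noisy = admitted(TokenBucket(rate_iops=100, burst=10), requests_per_second=1000)
quiet = admitted(TokenBucket(rate_iops=100, burst=10), requests_per_second=50)

print("noisy workload admitted:", noisy, "of 1000")
print("quiet workload admitted:", quiet, "of 50")
```

The noisy workload is held to roughly its 100 IOPS limit plus the burst allowance, while the quiet one, staying under its own limit, gets every request through. Picking sensible limits still depends on developers knowing what their applications actually need.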

Data Growth

Despite all efforts to curtail this, we store ever larger amounts of data. We need an industry-wide initiative to look at how we can better curate and manage data. And yet if we solve the problems above, the growth issue will simply get worse ... as we reduce the friction and the management overhead, we’ll simply consume more and more.

Perhaps the vendors should be concentrating on making it harder and even more expensive to store data. It might be the only way to slow down the inexorable demand for ever more storage. Still, that’s not really in their interest.

Sometimes one does wonder why all these problems persist ... ®
