Feeds

Big Data is getting too damn big - and nobody is helping to fix this

See that nettle? Time to pop your gardening gloves on, chaps

Choosing a cloud hosting partner with confidence

Storagebod As vendors race to be better, faster and to differentiate themselves in an already busy marketplace, the real needs of the storage teams can be left unmet - and also those of the storage consumer. At times it is as if the various vendors are building dragsters, calling them family saloons and hoping that nobody notices. The problems that I blogged about when I started out blogging seem still mostly unsolved.

Management

Storage management at scale is still problematic; it is still extremely hard to find a toolset that will allow a busy team to be able to assess health, performance, supportability and capacity at a glance. Still, too many teams are using spreadsheets and manually maintained records to manage their storage.

Tools which allow end-to-end management of an infrastructure from rust to silicon and all parts in-between still don’t exist or if they do, they come with large price-tags which invariably do not have a real ROI or a realistic implementation strategy.

As we build more silos in the storage-infrastructure, getting a view of the whole estate is harder now than ever. Multi-vendor management tools are in general lacking in capability with many vendors using subtle changes to inflict damage on the competing management tools.

Mobility

Data mobility across tiers where those tiers are spread across multiple vendors is hard; applications are generally not currently architected to encapsulate this functionality in their non-functional specifications. And many vendors don’t want you to be able to move data between their devices and competitors' ones - for obvious reasons.

But surely the most blinkered flash start-up must realise that this needs to be addressed; it is going to be an unusual company which will put all of its data onto flash.

Of course this is not just a problem for the start-ups but it could be a major barrier for adoption and is one of the hardest hurdles to overcome.

Scaling

Although we have scale-out and scale-up solutions, scaling is a problem. Yes, we can scale to what appears to be almost limitless size these days but the process of scaling brings problems. Adding additional capacity is relatively simple; rebalancing performance to effectively use that capacity is not so easy. If you don’t rebalance, you risk hotspots and even under-utilisation.

It requires careful planning and timing even with tools; it means understanding the underlying performance characteristics and requirements of your applications. And with some of the newer architectures that are storing metadata and de-duping, this appears to be a challenge to vendors. Ask questions of vendors as to why they are limited to a number of nodes; there will sheepish shuffling of feet and alternative methods of federating a number of arrays into one logical entity will quickly come into play.

And then mobility between arrays becomes an issue to be addressed.

Deterministic Performance

As arrays get larger, more workloads get consolidated onto a single array - and without the ability to isolate workloads or guarantee performance, the risk of bad and noisy neighbours increases. Few vendors have yet grasped the nettle of QoS and still fewer developers actually understand what their performance characteristics and requirements are.

Data Growth

Despite all efforts to curtail this, we store ever larger amounts of data. We need an industry-wide initiative to look at how we can better curate and manage data. And yet if we solve the problems above, the growth issue will simply get worse ... as we reduce the friction and the management overhead, we’ll simply consume more and more.

Perhaps the vendors should be concentrating on making it harder and even more expensive to store data. It might be the only way to slow down the inexorable demand for ever more storage. Still, that’s not really in their interest.

Sometimes one does wonder why all these problems persist ... ®

Security for virtualized datacentres

More from The Register

next story
It's Big, it's Blue... it's simply FABLESS! IBM's chip-free future
Or why the reversal of globalisation ain't gonna 'appen
'Hmm, why CAN'T I run a water pipe through that rack of media servers?'
Leaving Las Vegas for Armenia kludging and Dubai dune bashing
Bitcasa bins $10-a-month Infinite storage offer
Firm cites 'low demand' plus 'abusers'
Facebook slurps 'paste sites' for STOLEN passwords, sprinkles on hash and salt
Zuck's ad empire DOESN'T see details in plain text. Phew!
CAGE MATCH: Microsoft, Dell open co-located bit barns in Oz
Whole new species of XaaS spawning in the antipodes
Microsoft and Dell’s cloud in a box: Instant Azure for the data centre
A less painful way to run Microsoft’s private cloud
prev story

Whitepapers

Why cloud backup?
Combining the latest advancements in disk-based backup with secure, integrated, cloud technologies offer organizations fast and assured recovery of their critical enterprise data.
A strategic approach to identity relationship management
ForgeRock commissioned Forrester to evaluate companies’ IAM practices and requirements when it comes to customer-facing scenarios versus employee-facing ones.
Security for virtualized datacentres
Legacy security solutions are inefficient due to the architectural differences between physical and virtual environments.
Reg Reader Research: SaaS based Email and Office Productivity Tools
Read this Reg reader report which provides advice and guidance for SMBs towards the use of SaaS based email and Office productivity tools.
New hybrid storage solutions
Tackling data challenges through emerging hybrid storage solutions that enable optimum database performance whilst managing costs and increasingly large data stores.