Feeds

Big Data tools cost too much, do too little

SHOCKING REVELATION: Fashionable technology is high maintenance

Maximizing your infrastructure through virtualization

Strata 2013 Hadoop and NoSQL are the technologies of choice among the web cognoscenti, but one developer and technical author says they are being adopted too enthusiastically by some companies when good 'ol SQL approaches could work just as well.

Ever since a team at Yahoo! did their turn at being prometheus and brought Google-magic down to the rest of us via Hadoop, companies have been on a tear to put the technology into play. But the costs are high, the effort is great, and the advantage it grants you can be slight, Tim O'Brien said in a packed session at the O'Reilly Strata conference in Santa Clara on Wednesday.

"There is a feeling afoot that some of the technologies we've been talking about at a conference like this end up having a huge price tag," he said.

Citing huge human costs (you need to hire expensive in-demand people who know how to use Hadoop), pricey implementation (migrate your data into NoSQL or HDFS without it going wonky) and the possibility of unanticipated problems (you may not fully understand what you are using), O'Brien poured water on the fiery enthusiasm with which it's been adopted by the tech world and its dog.

Big data is a necessity at scale: if you're trying to listen to every transatlantic phonecall, you need to use MapReduce. ... if you need to search the entire internet in milliseconds you need to use MapReduce, if you need to run the largest social network in the world you need to use MapReduce. If you don't you can probably scale with a database.

The way companies have adopted the gamut of "big data" technologies ranging from MongoDB to Hadoop or Impala, means that their own stacks have become difficult to maintain and hard to understand, O'Brien said. "The things I'm being asked to support in production. ... I couldn't even tell you how many databases they use."

For a few large-scale companies, "big data" products are a necessity. For others, they could be useful tools, but for some adopters, the use of these technologies could be "pushing solutions on problems where they may not be appropriate," he said.

If you've got 10TB or less of data upon which you want to run analyses, then you can still get by on Postgres or some other typical system, he said. But if you're expecting to be logging a PB of data then you need to make your way to Hadoop or something else soon. "Don't wait," he said.

Eighty per cent of the market is driven by the tip of the tech pyramid, O'Brien said. "I'm not trying to say a [Hadoop-using] startup out there is doing it wrong, but I have worked on projects where I wish they'd use MySQL because they've only had a gigabyte of data."

Even Google, the progenitor of all of this technology via the vaunted BigTable and GFS academic papers, has itself moved away from the techniques pioneered by NosQL and Hadoop community via its recent "Spanner" database.

Spanner looks much more like a relational, SQL-style database than anything else, and where Google goes the world follows. This is already happening with other companies, such as TransLattice re-implementing Spanner's structure, and getting much interest because of it.

Perhaps NoSQL and Hadoop have led some companies down a blind alley? The Register's database desk had many conversations at Strata on Wednesday during which companies bemoaned the diversity of the "big data" ecosystem and wished for consolidation to make life easier for end-users.

Companies and technologies have proliferated, as have marketing budgets, and perhaps, as O'Brien's talk outlines, this has gone too far and bitten some novice adopters. These technologies may be big, but they're only as clever as the company using them. ®

The Power of One eBook: Top reasons to choose HP BladeSystem

More from The Register

next story
Sysadmin Day 2014: Quick, there's still time to get the beers in
He walked over the broken glass, killed the thugs... and er... reconnected the cables*
Auntie remains MYSTIFIED by that weekend BBC iPlayer and website outage
Still doing 'forensics' on the caching layer – Beeb digi wonk
SHOCK and AWS: The fall of Amazon's deflationary cloud
Just as Jeff Bezos did to books and CDs, Amazon's rivals are now doing to it
BlackBerry: Toss the server, mate... BES is in the CLOUD now
BlackBerry Enterprise Services takes aim at SMEs - but there's a catch
The triumph of VVOL: Everyone's jumping into bed with VMware
'Bandwagon'? Yes, we're on it and so what, say big dogs
Carbon tax repeal won't see data centre operators cut prices
Rackspace says electricity isn't a major cost, Equinix promises 'no levy'
prev story

Whitepapers

Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Consolidation: The Foundation for IT Business Transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.
Application security programs and practises
Follow a few strategies and your organization can gain the full benefits of open source and the cloud without compromising the security of your applications.
How modern custom applications can spur business growth
Learn how to create, deploy and manage custom applications without consuming or expanding the need for scarce, expensive IT resources.
Securing Web Applications Made Simple and Scalable
Learn how automated security testing can provide a simple and scalable way to protect your web applications.