Big data enters open-source hype cycle
Riches for some, mostly not VCs
Open ... and Shut As breathless projections go, IDC's big data market forecast may be in for a serious asthma attack. The venerable analyst firm pegs the brave new world of big data at $16.9bn by 2015. Yet it's unclear just how new this market is and whether anyone but big data start-ups is really cashing in on the gold rush.
Is it the open source hype cycle, replayed in big-data style?
Possibly. Open source was all the rage in the tech press for years as it promised to lower costs while improving enterprise IT freedom. Ultimately, a few start-ups cashed out big time (MySQL, JBoss), but for the most part the real value in open source came as both IT vendors and in-house IT organisations turned to open source to provide raw material for their software projects. Open source became less about sales and more about code, which was exactly what it was designed for.
Today, venture capitalists are throwing piles of cash into big data start-ups hoping to strike it rich, and some undoubtedly will. But let's be clear: data analytics has long been part of the tech industry. We may choose to call it "Big Data" now but it has been a staple of forward-thinking industries for at least 20 years, as one blogger notes.
Call it data warehousing and data mining. Call it business analytics. Call it whatever you want. It's not new, and it's not even necessarily a game changer, given that many industries have long been optimising for data collection and analytics, potentially leaving little room to significantly improve.
However, there are at least two big areas in which the new big data, much like open source, trumps its antecedents: cost and scale.
These two factors, perhaps more than anything else, account for the startling rise in Hadoop's popularity, even as the more staid "data mining" has lost its lustre. Hadoop makes the collection and analysis of data possible on low-cost, easily scaled, commodity hardware. In the past, a financial services company that wanted to run credit analysis jobs had to write an IBM a huge cheque to cover the cost of the proprietary hardware and software.
Not anymore. Hadoop has democratised data, turning it into a competitive market.
Not that Hadoop was born in a vacuum. A variety of other things - including, for example, rising infrastructure capacity, mobile devices and social data - have contributed to making Hadoop highly relevant to a broad range of people. Importantly, as with open source before it, the real value of Hadoop and the big-data movement is being captured within the enterprise, rather than being paid out to different vendors. Yes, there will be home-run exits for Hadoop-savvy start-ups, but the bigger win is all of the internal Hadoop expertise that will be hired and developed.
In sum, yes, big data is big. But it's not really new. What is new is the ability to process immense quantities of data for pennies on the data warehousing dollar. And, similar to open source, most of the big data value is likely to be captured by in-house teams that know how to put their industry knowledge to use in interpreting the data they collect. The good news, then, is that some entrepreneurs will strike it rich in the big data gold rush.
The even better news is that most of the riches will remain with the enterprises enabled by Hadoop and other big-data technologies. ®
Matt Asay is senior vice president of business development at Nodeable, offering systems management for managing and analysing cloud-based data. He was formerly SVP of biz dev at HTML5 start-up Strobe and chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). His column, Open...and Shut, appears three times a week on The Register.