Yahoo! Hadoop! brain! spin-off! doomed! to! fail!
The brains already left
Open...and Shut The once-dominant Yahoo is apparently keen to compete with one of today's hottest startups, Cloudera, to own the affections of data plumbers everywhere.
According to The Wall Street Journal, Yahoo is actively considering spinning out its Apache Hadoop engineering team into a startup, possibly backed by Benchmark Capital. The intent? Make a bunch of money from one of the industry's most important technologies, one used by a Who's Who of enterprises. The likely outcome? Utter failure.
Yes, Yahoo still generates more than $6 billion each year. Yes, Apache Hadoop is hot. And, yes, Yahoo CEO Carol Bartz has a foul, trash-talking mouth. But no, the three don't necessarily translate into a billion-dollar opportunity for a Yahoo-sponsored Apache Hadoop startup.
After all, while the intent of the spin-off would be to clear a path for the startup to operate independently of Yahoo, the reality is that Yahoo would remain a major shareholder. Yahoo has seen essentially flat revenues for four years and long ago lost its reputation for being a growth leader. Give the company a board seat and it's hard to foresee much beyond bureaucracy. If VMware's experience is anything to go by, Yahoo is likely to get as many as two board seats.
Indeed, VMware's parent company, EMC, also chairs the VMware board with its CEO, Joseph Tucci. Carol Bartz, Yahoo's CEO, is a great executive, but would this new startup want her leading its board? Doubtful, especially since it could seriously complicate the startup's possible exits and partners.
It's possible, of course, that things could be structured in such a way as to minimize Yahoo's influence on the company. With the right team in place, such a startup could go far, assuming it could find the right people. It's doubtful, however, that those "right people" currently reside at Yahoo, as I'll explain below.
Benchmark's Rob Bearden, former COO at both JBoss and SpringSource, revealed that Benchmark has been actively courting Yahoo about funding the startup. I could easily see Bearden gathering his network of hard-core Java developers and business executives to take on the challenge. I know Bearden, and think highly of his business acumen. He'd be a formidable competitor, despite Cloudera's head start.
But there's reason to think he'd be too late in pulling together the cream of the Apache Hadoop crop. After all, Cloudera already employs some of the most critical Apache Hadoop developers, including Apache Hadoop co-founders Doug Cutting and Mike Cafarella. Plenty remain at Yahoo, but it feels like the heart and soul of Apache Hadoop is already at Cloudera…and eBay, Facebook, Apple, etc. Few of the brightest Apache Hadoop lights remain at Yahoo.
And why would they? Yahoo's been a tough place to work for a long time. As noted, the number of Apache Hadoop developers employed by Yahoo has shrunk steadily over time, and the character of those developers has also changed. The driven start-up guys left a long time ago. There were reasons to stay at Yahoo, but working in an exciting start-up environment for a company that was transforming the industry wasn't on the list. The people who stuck around were paid well and worked a big-company schedule. In other words, the Apache Hadoop developers who remain at Yahoo are precisely the wrong sort to bet on in startup land.
But they might be the right sort to help drive Yahoo's business. I spoke with a source familiar with Yahoo's business, and he reminded me that Yahoo relies on Hadoop for all portal content personalization: mail, news, spam filtering, etc. This is strategic to Yahoo's business, while core Hadoop development is not.
Bartz would be smart to shed developers to trim costs, but then she's already doing that: Apache Hadoop developers have been leaving Yahoo for years. Trying to get a startup spun out and funded seems like a foolish way to accomplish this goal, with little real chance of disrupting the current momentum in the industry.
Cloudera's business has been booming, too, making it tough on any late entrant to the Apache Hadoop race. While somewhat true in any business, it's particularly the case for open-source businesses.
Open source tends to be winner-takes-all, as open source expert Dave Rosenberg has argued. The first company to own a particular project or category "wins" that category. Middleware? JBoss. Database? MySQL. Web content management? Acquia/Drupal. And so on.
In the case of Apache Hadoop, Cloudera has been winning customers for over two years. Yahoo's startup would first have to sort out the mechanics of the new company, launch, market its Apache Hadoop distribution, overcome the market's suspicions about Yahoo's involvement with the new company, and actually start to close deals.
Oh, and then there's the problem of building a truly enterprise-ready distribution of Apache Hadoop. Yahoo's Apache Hadoop team is in the business of developing features for a single customer: Yahoo. Running an enterprise software company is hard work. You don't get to spend your time coding up the cool new features -- you need to write a roadmap, ship on time and invest enormous effort in chasing big elephants for your first customers, in order to convince others to follow.
No matter how mature Yahoo thinks its distribution is, it has never been run in production anywhere other than Yahoo. That's not a recipe for immediate, or even long-term, success.
This isn't to say that Yahoo couldn't succeed, but it is to suggest that there may be more efficient paths to success. For one, Yahoo could invest in Cloudera. No, this wouldn't give it the same level of ownership or control, but it would give the company a stake in the market leader for Apache Hadoop services. For another, Benchmark could simply hire away Yahoo's best engineers and start a new Apache Hadoop-related company without Yahoo's involvement. This would give Benchmark a cleaner ownership structure without sacrificing anything in terms of intellectual property. (Apache Hadoop is liberally licensed, so anyone can download and use it.)
In fact, the only option that strikes me as doomed to fail is Yahoo starting up a company to provide support and other services around Apache Hadoop. We already have a successful company doing that, one that Yahoo is highly unlikely to beat. ®
Matt Asay is senior vice president of business development at Strobe, a startup that offers an open source framework for building mobile apps. He was formerly chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). His column, Open...and Shut, appears twice a week on The Register.
COMMENTS
@Matt, food for thought...
Your logic is that a Yahoo! spin off would fail because Cloudera is already in that space.
Ok, so as the former Ubuntu COO, how do you justify Ubuntu's existence when RedHat was there first?
I mean by your logic, Ubuntu, SuSE, and all of the other flavors of Linux are doomed to fail because RedHat is already there.
@Matt, the only failure is your article... and here's why.
Let's start with your preamble:
"The intent? Make a bunch of money from one of the industry's most important technologies, one used by a Who's Who of enterprises. The likely outcome? Utter failure."
If you made that statement about a start up in general, you'd have a safe bet. Most start-ups fizzle before anyone knows that they existed. Spin-offs? Even Spin-offs have a high failure rate.
But if Yahoo! spun off their Hadoop work, there is a high chance of success.
1) Access to capital. As you said... Yahoo! would retain a large share of the company and it also has a lot of capital so outside funding wouldn't necessarily be needed.
2) Brand recognition. Yahoo! definitely has a name and any spin off would equally get a lot of good publicity out of the gate.
3) Existing product. Yahoo! has been a major contributor of code to Hadoop. Sure Cloudera has been in this space and had already built up their ecosystem. But Yahoo! also has a lot of internal efforts that are critical to commercializing Hadoop. (Did you read the MR2 blogs?)
Of course you are right. A Yahoo! spin off would still be missing key core components that would be necessary for success.
A) Executive guidance and leadership.
B) Infrastructure for Support
C) Technical Writing Staff
D) Training
E) Professional Services Expertise
F) Professional Sales Team / Marketing
All of these areas are essential and if any Yahoo! spin off hits on these, it could easily outperform Cloudera. (Yes, I know a lot of the guys at Cloudera...)
To your point that Cloudera has already established themselves, yes that's true.
But it's also true that being first to market doesn't always mean that you'll end up on top. ;-)
I'll wager that if Yahoo! looks outside of the Silicon Valley and taps the right people on the shoulder... they can outperform Cloudera and gain a serious chunk of market share.
The question you have to ask yourself: can a Yahoo! spin off company provide better service and value than Cloudera? If so, they will do well.
But what do I know?
;-)
Winning the Open Source category.
"Middleware? JBoss. Database? MySQL. Web content management? Acquia/Drupal. And so on."
Given the author's background it might be amusing to append: "Linux? Red Hat." to that little lot.
But then, as we know, all generalisations are wrong.......