Yahoo!'s open source elephant loses its daddy

Hadoop founder departure 'unrelated' to MS pact

Yahoo! is losing the founder of Hadoop, that increasingly popular open source grid platform based on Google's proprietary software infrastructure.

On September 1, after three and a half years with Yahoo!, Doug Cutting will join Cloudera, the commercial Hadoop startup that launched earlier this year. As reported by the New York Times, Cutting announced his departure from Yahoo! this morning at a company meeting.

His announcement comes a little more than a week after Yahoo! agreed to replace its homegrown search tech with Microsoft Bing, a move that will eventually see the destruction of Yahoo!'s largest Hadoop app: the Yahoo! Search Webmap, which provides the company's search engine with a database of all known web pages. But in speaking with The Times, Cutting said his move has nothing to do with the Microhoo pact.

"This has been in the works for awhile and is unrelated," he said. "I am definitely not leaving in any sort of protest, and the thing I like least about this move is that it might be perceived that way."

Cutting did not respond to a request for comment. But Cloudera CEO Mike Olson confirmed that the startup was in talks with Cutting before the Microhoo deal. "Doug's stature in the community as the founder of the project has made him a pretty interesting candidate for us for some time," Olson tells The Reg. "My conversations with Doug preceded [the Microhoo deal] by some amount of time."

Inspired by Google research papers describing Mountain View’s proprietary software infrastructure, Hadoop is a means of crunching epic amounts of data across a network of distributed machines. Cutting first developed the platform for use with Nutch, his open source web crawler, naming it after his son's yellow stuffed elephant. But he was soon hired to help spearhead the project as a Yahoo! employee. The Google-battling web giant became the largest contributor to the Apache-hosted project, and by the beginning of last year, Hadoop had found its way onto the company's production systems, including Webmap.

The platform also underpins such web services as Facebook and Powerset, the semantic search engine that's now part of Microsoft's Bing. But Yahoo! remained the center of the community - at least until now.

In the wake of the Microhoo deal, Yahoo! reaffirmed its commitment to Hadoop, saying it would still power non-search technologies at the company. "Don't Panic!," wrote Hadoop development VP Eric Baldeschwieler. "We are as committed as ever to building a world class open source Cloud Computing infrastructure and Apache Hadoop remains our solution for batch computing. Hadoop is used to solve many, many internet scale problems beyond search at Yahoo."

And Cloudera's Mike Olson believes Yahoo! will remain a major player in the project. "[Cutting's move] helps us in our standing in the community," Olson tells us, "but I don't think Yahoo!'s role is diminished in any way."

But this does seem like a major shift in the project's center. And though Cutting and Cloudera say they were talking before the Microhoo deal, the Microhoo pact is hardly the sort of thing that would put a damper on his move.

In a canned statement, Yahoo! wished Cutting well. "We are very happy to have had Doug as part of our Yahoo! Hadoop team for the past three and a half years," the company said. "In that time we’ve worked together to make Apache Hadoop the most powerful and widely used open source software for handling large data sets and computing at Internet-scale. Moving forward, we wish Doug the best in his new endeavors. We are looking forward to continuing to lead on innovation and investment in Hadoop, as well as to collaborating with Doug and the growing Hadoop community."

As Yahoo! continues to lose its new-found mojo, Cloudera's stock is on the rise. Think of the Silicon Valley-based startup as the Red Hat of the Hadoop world. Cutting joins an all-star lineup of tech veterans, including former Googler Christophe Bisciglia, whom Google famously dispatched to the University of Washington to teach a course on what the company likes to call Big Data.

For the past two years, Yahoo! has hosted the annual Hadoop Summit near its home base in Sunnyvale. But on October 2, it's Cloudera's turn to MC an east coast incarnation: Hadoop World: NYC. You can bet that a certain new hire will be in attendance. ®
