Feeds

Big Blue Google cloud injected with $5m

How to simulate an ocean

The essential guide to IT transformation

The US National Science Foundation has tossed $5 million at Google's effort to educate the country's university students in the ways of Big Data.

Back in the fall 2007, Google teamed with IBM to provide various universities with access to a dedicated compute cluster where students could explore the sort of mega-data-crunching techniques that unpin its web-dominating search engine. Both Google and Big Blue shoved between $20m to $25m behind the initiative, and today, the NSF announced a roughly $5 million grant that will fund the data-crunching research of 14 separate institutions, including MIT, Yale, Carnegie Mellon, and University of Utah.

"The computational and storage resources provided by this Google-IBM initiative allows us to perform complicated interactive analysis of a pretty-much unprecedentedly large amount of data," Claudio Silva, associate professor at the University of Utah, tells The Reg. "It has the ability to completely transform the way we do data analysis and visualization...

"The computing centers that companies like Microsoft, Amazon, and Google are using are even larger than anything the government has built."

For instance, Silva says, the university will use Google's distributed compute power to crunch vast amounts of data on behalf of NSF oceanographers. "The project looks to do coastal observation and prediction...We have a lot of sensor and simulated data involving the Columbia River and the Pacific Northwest Ocean, and right now, it takes an enormous amount of time to shift through all the data and answer the questions that need answering."

You see, Google is interested in prepping the country's top computer science students for life at Google. That research compute cluster runs Hadoop, an open source platform based on Google's distributed file system, GFS, and its software framework for distributed data-crunching, known as MapReduce.

According to Christophe Bisciglia - the former Google engineer who recently jumped ship for the Hadoop startup Cloudera - the cluster sits inside one of Google's famously podified data centers. Biciglia has told The Reg that the cluster was set up in a ring-fenced portion of the data center scheduled for "decommissioning" back in 2007.

Before he left Google, Bisciglia taught a course on Googlicious Big Data at his alma mater, the University of Washington, and the Hadoop-happy curriculum - since open sourced under a Creative Commons license - is now taught at several other universities across the country. Meanwhile, IBM has provided students with Eclipse-based open source tools for building their own apps atop Hadoop.

Hadoop was founded by a man named Doug Cutting, who now works at Yahoo!. The company now backs at least a portion of its web operation with Hadoop, and like Google and IBM, it's working to prepare the next generation of computer scientist for interweb-scale data transformations on low-cost distributed machines. Yahoo! offers up its own Hadoop research cluster, the M45, to various American universities.

But as Hadoop educates the world in Big Data, Google continues to keep its veil of secrecy over the particulars of its own GFS and MapReduce. Naturally. ®

Secure remote control for conventional and virtual desktops

More from The Register

next story
Apple promises to lift Curse of the Drained iPhone 5 Battery
Have you tried turning it off and...? Never mind, here's a replacement
Mozilla's 'Tiles' ads debut in new Firefox nightlies
You can try turning them off and on again
Linux turns 23 and Linus Torvalds celebrates as only he can
No, not with swearing, but by controlling the release cycle
Scratched PC-dispatch patch patched, hatched in batch rematch
Windows security update fixed after triggering blue screens (and screams) of death
This is how I set about making a fortune with my own startup
Would you leave your well-paid job to chase your dream?
prev story

Whitepapers

Endpoint data privacy in the cloud is easier than you think
Innovations in encryption and storage resolve issues of data privacy and key requirements for companies to look for in a solution.
Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Advanced data protection for your virtualized environments
Find a natural fit for optimizing protection for the often resource-constrained data protection process found in virtual environments.
Boost IT visibility and business value
How building a great service catalog relieves pressure points and demonstrates the value of IT service management.
Next gen security for virtualised datacentres
Legacy security solutions are inefficient due to the architectural differences between physical and virtual environments.