Big Blue Google cloud injected with $5m

How to simulate an ocean

The US National Science Foundation has tossed $5 million at Google's effort to educate the country's university students in the ways of Big Data.

Back in the fall of 2007, Google teamed with IBM to provide various universities with access to a dedicated compute cluster where students could explore the sort of mega-data-crunching techniques that underpin its web-dominating search engine. Both Google and Big Blue shoved between $20m and $25m behind the initiative, and today, the NSF announced a roughly $5 million grant that will fund the data-crunching research of 14 separate institutions, including MIT, Yale, Carnegie Mellon, and the University of Utah.

"The computational and storage resources provided by this Google-IBM initiative allows us to perform complicated interactive analysis of a pretty-much unprecedentedly large amount of data," Claudio Silva, associate professor at the University of Utah, tells The Reg. "It has the ability to completely transform the way we do data analysis and visualization...

"The computing centers that companies like Microsoft, Amazon, and Google are using are even larger than anything the government has built."

For instance, Silva says, the university will use Google's distributed compute power to crunch vast amounts of data on behalf of NSF oceanographers. "The project looks to do coastal observation and prediction...We have a lot of sensor and simulated data involving the Columbia River and the Pacific Northwest Ocean, and right now, it takes an enormous amount of time to shift through all the data and answer the questions that need answering."

You see, Google is interested in prepping the country's top computer science students for life at Google. That research compute cluster runs Hadoop, an open source platform based on Google's distributed file system, GFS, and its software framework for distributed data-crunching, known as MapReduce.
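For readers who haven't met the MapReduce model, here is a minimal single-machine sketch of the idea in Python. It is not Hadoop's API, nor Google's implementation: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are hypothetical, and the word-count task is simply the customary teaching example. A real framework would spread the map and reduce calls across many machines and handle the grouping step itself.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all the values seen for one key into a total."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on a list of text records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(run_job(["big data", "Big Blue"]))
```

Because each map call touches only one record and each reduce call touches only one key, the work can in principle be split across as many machines as there are records or keys, which is the property that makes the model attractive for very large data sets.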

According to Christophe Bisciglia - the former Google engineer who recently jumped ship for the Hadoop startup Cloudera - the cluster sits inside one of Google's famously podified data centers. Bisciglia has told The Reg that the cluster was set up in a ring-fenced portion of the data center scheduled for "decommissioning" back in 2007.

Before he left Google, Bisciglia taught a course on Googlicious Big Data at his alma mater, the University of Washington, and the Hadoop-happy curriculum - since open sourced under a Creative Commons license - is now taught at several other universities across the country. Meanwhile, IBM has provided students with Eclipse-based open source tools for building their own apps atop Hadoop.

Hadoop was founded by a man named Doug Cutting, who now works at Yahoo!. The company now backs at least a portion of its web operation with Hadoop, and like Google and IBM, it's working to prepare the next generation of computer scientists for interweb-scale data transformations on low-cost distributed machines. Yahoo! offers up its own Hadoop research cluster, the M45, to various American universities.

But as Hadoop educates the world in Big Data, Google continues to keep its veil of secrecy over the particulars of its own GFS and MapReduce. Naturally. ®
