
Cloudera Hadoop plugs trunk into Netezza iron

Stuffed elephant meets data warehousing


Cloudera – the commercial Hadoop outfit – has teamed with data analytics maven Netezza to build a connector between its stuffed elephant distro and Netezza's TwinFin data warehousing appliances.

Due at the end of the third quarter, the connector will allow users to move data from a Netezza appliance to the Cloudera Distribution for Hadoop (CDH) – and vice versa.

Based on research papers describing Google’s proprietary infrastructure, the open source Hadoop is a way of crunching massive amounts of data across a network of distributed machines. Named after the yellow stuffed elephant belonging to the son of project founder Doug Cutting, the platform underpins net services offered by everyone from Yahoo!, Facebook, and Twitter to Microsoft. Yes, Microsoft.

Meanwhile, Netezza's TwinFin blade servers offer a customized PostgreSQL database. Like other data warehouses, they run ad hoc SQL queries against epic data sets.

"One thing we have seen at Cloudera is substantial existing use of Netezza's product in our big enterprise accounts," Cloudera CEO Mike Olson tells The Reg. "And they're looking as Hadoop as a complement to the existing Netezza use. We view ourselves as another piece of the puzzle, solving a different problem: complex data and hard core exhaustive analytics, more exotic algorithms running over complex data at scale."

But Olson also stresses that after data is crunched by Hadoop, users will be able to move it back to the Netezza appliance for additional exploration. "Enterprises want to take structured data – customer and transaction data – and combine it with all the unstructured data coming off their websites...that might not fit into a tabular schema well."

Hadoop, for instance, might be used to crunch data relating to user behavior on a website. "What we call Web 2.0 sites have users that move around on their site, post status updates, interact with other individuals. All of that activity is captured in web logs that can't easily be digested using existing relational systems. Hadoop can look at all that activity, identify individual users, digest their behavior, and begin to make predictions about behavior," Olson continues.

"But these companies want to combine want these users do with who they are, but that information is often in a system like Netezza's."

"A lot of people assumed that Hadoop was innately competitive with Netazza and other players," says Olson. "But that's not what we're seeing. We're seeing an appetite for this new technology [Hadoop] to solve problems with data – but it has to work well with existing and expanding investments in [data warehousing]."

Last month, Cloudera teamed with Oracle-tools shop Quest Software to build a Hadoop connector for Oracle. It's due in Q3 as well. ®
