Teradata hitches Aster hybrid database to Hadoop

SQL-H makes elephants chatty

Hadoop World 2012 Like everyone else in the business analytics racket, Teradata has to come up with ways to integrate its products with batch-style Hadoop data munchers.

The company partnered with Hadoop distie Cloudera in September 2010 to create a pipe between Hadoop clusters and Teradata data warehouses, and now Teradata is providing a little more insight into how it will link Hadoop to its Aster Data hybrid row-column database for analytical processing.

We already knew that Teradata was working on Hadoop integration with Aster Data databases, which can store data and search it by row or column. Aster also has its own SQL-MapReduce algorithms that overlay this massively parallel database and perform functions similar to those MapReduce performs on a Hadoop cluster – albeit a lot faster and on much smaller data sets.

Back in February of this year, Teradata announced a partnership with Hortonworks, the Hadoop distie spun out of the Yahoo! engineering team that actually created Hadoop (or, more precisely, out of what remained of that team after some employees left to form other big data firms, including Cloudera).

At Hadoop World 2012, Teradata lifted the veil a little on how it will integrate Hadoop data stores with Aster Data databases, as part of a preview of its upcoming Aster Database 5.0 release. It turns out that HCatalog is the key superglue that will link Hadoop to Aster databases. HCatalog is the metadata overlay for the file formats of the Hadoop Distributed File System and the different components of the Hadoop stack; it is being championed by Hortonworks and is a key component of its Data Platform 1.0 Hadoop distribution, also announced this week. The other piece of the link is a query language feature of the future Aster Database 5.0 release called SQL-H.

SQL-H is an extension of ANSI-standard SQL, Steve Wooledge, senior director of marketing at Aster Data, explains to El Reg. It will allow business analysts to use SQL-like statements to work through HCatalog to see and query data stored in HDFS, and to suck that data into memory on the Aster cluster so it can be sorted, diced, sliced, and otherwise analyzed.

SQL-H extracts data from HDFS without going through Pig, a high-level language for running MapReduce routines, or Hive, an ad hoc query language for Hadoop. Both are Apache projects as well and are usually part of a Hadoop distribution. SQL-H requires the Aster Database – and the forthcoming 5.0 release at that – and can be thought of as a more relational-friendly way of getting at Hadoop data than Pig or Hive (at least if you are used to SQL and have no idea how to use Pig or Hive).
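To make the idea of a metadata overlay concrete, here is a minimal Python sketch. It is not SQL-H or HCatalog code, and none of the names in it come from Teradata or Hortonworks: the `catalog` dictionary, the `clicks` table, and the sample rows are all invented for illustration. It only shows the general pattern, in which a catalog records where a file lives and how its columns are laid out, so the person asking the question never has to deal with the raw file format.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for a metadata catalog: it maps a table name to the
# file location, the delimiter, and the column names and types.
def make_demo_catalog(directory):
    path = os.path.join(directory, "clicks.tsv")
    with open(path, "w", newline="") as f:
        # Made-up sample data: visitor, page, hits
        f.write("alice\t/home\t3\nbob\t/cart\t7\nalice\t/cart\t2\n")
    return {
        "clicks": {
            "path": path,
            "delimiter": "\t",
            "columns": [("visitor", str), ("page", str), ("hits", int)],
        }
    }

def read_table(catalog, name):
    """Yield rows as dicts, using only the catalog to interpret the file."""
    meta = catalog[name]
    with open(meta["path"], newline="") as f:
        for raw in csv.reader(f, delimiter=meta["delimiter"]):
            yield {
                col: cast(value)
                for (col, cast), value in zip(meta["columns"], raw)
            }

def hits_for_visitor(catalog, visitor):
    """Sum hits for one visitor without knowing how the file is stored."""
    return sum(
        row["hits"]
        for row in read_table(catalog, "clicks")
        if row["visitor"] == visitor
    )

with tempfile.TemporaryDirectory() as tmp:
    demo_catalog = make_demo_catalog(tmp)
    alice_hits = hits_for_visitor(demo_catalog, "alice")
    print(alice_hits)
```

The caller of `hits_for_visitor` names a table and a column and nothing else; swapping the delimiter or the file location would only change the catalog entry.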

The other neat thing about SQL-H, says Wooledge, is that if you want to grab a chunk of data out of Hadoop and plunk it directly into the Aster database for processing later, you can extract the data through HCatalog and save it in Aster Database tables.
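The trade-off between querying external data on the fly and keeping a local copy can be sketched in a few lines of Python. This is only an analogy: SQLite stands in for the database here, the `clicks` table and its rows are invented, and nothing below reflects actual SQL-H syntax, which Teradata has not published in this article.

```python
import sqlite3

# Rows as they might arrive from an external store; the values are made up.
EXTERNAL_ROWS = [
    ("alice", "/home", 3),
    ("bob", "/cart", 7),
    ("alice", "/cart", 2),
]

def materialize(conn, rows):
    """Copy externally sourced rows into a local table so that later
    queries do not need to fetch them from the external store again."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS clicks "
        "(visitor TEXT, page TEXT, hits INTEGER)"
    )
    conn.executemany("INSERT INTO clicks VALUES (?, ?, ?)", rows)
    conn.commit()

def hits_per_visitor(conn):
    """Run an ordinary SQL aggregate against the local copy."""
    cur = conn.execute(
        "SELECT visitor, SUM(hits) FROM clicks "
        "GROUP BY visitor ORDER BY visitor"
    )
    return cur.fetchall()

# SQLite in memory is an assumption for the demo, not the Aster Database.
conn = sqlite3.connect(":memory:")
materialize(conn, EXTERNAL_ROWS)
totals = hits_per_visitor(conn)
print(totals)
```

Once the rows are in the local table, every later query is plain SQL against that table, which is the behavior Wooledge describes for data saved into Aster Database tables.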

How Aster SQL-H hooks into Hadoop HDFS

Either way you use the data, inside memory or on persistent disk, you can integrate it with other business intelligence tools, such as the Aprimo marketing automation software now owned by Teradata or MicroStrategy dashboarding and reporting software, just to name two use cases. The point is, business analysts can work through SQL-H and not even know they are smacking against the very alien HDFS file format.

There's no word on what kind of performance this SQL-H add-on for Aster Database 5.0 has, of course, since it is not shipping until the third quarter. Pricing has not been set yet, but Wooledge says that the plan is to charge only a "nominal fee" over and above the Aster Database license fees. The reasoning is that Teradata believes that once customers start playing with SQL-H, they will want to store significant amounts of data inside Aster Database rather than culling it from HDFS repeatedly. This will, of course, drive Aster Database sales, and that is really the point of this exercise. ®
