
Hadoop Hive stung into action, swarms around SQL

More relational, more useful to humans, we're promised


Hortonworks has unveiled the Stinger Initiative, a project to make Hadoop’s Hive data warehouse friendlier with SQL and faster.

Hortonworks has also unveiled two accompanying Hadoop projects, which it’s submitted to the Apache Software Foundation (ASF) in the hope they become community-supported projects. They are a runtime called Tez and a sign-in and authentication system called Gateway. Both Tez and Gateway are ASF incubator projects.

Hadoop services startup Hortonworks said Stinger would “enhance Hive with more SQL and better performance” for what it called “human-time use cases”.

Translated, Stinger should make Hive friendlier and faster to use in data querying and analytics normally undertaken by SQL and relational tools.

Hive, like the rest of the Hadoop architecture, has thrived on crunching batches of data – Hadoop is an open-source implementation of Google’s MapReduce and a NoSQL system.

However, the NoSQL crowd has realised it needs to make its architectures work better with the SQL-like tools that businesses use in the real world.

The standard SQL interface for Hive is HiveQL, but it doesn't match the latest SQL standard - and support for HiveQL is not widespread, so banking your data infrastructure on it is a bit of a gamble. ASF's HiveQL project web page is deprecated, and simply points you to the HiveQL programming manual.

According to Hortonworks, Stinger will make Hive “a more suitable tool for the decision support queries people want to perform on Hadoop”.

This means the addition of analytics features such as the OVER clause, support for subqueries in WHERE and aligning Hive’s type system with the standard SQL model.
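To give a feel for what those two features do, here is a minimal sketch in Python. It runs against SQLite's in-memory engine rather than Hive, purely to illustrate the standard SQL semantics of an OVER clause and a subquery in WHERE; the `sales` table, its columns and the `run_examples` function are made up for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3


def run_examples():
    """Run an OVER-clause query and a WHERE-subquery query on toy data.

    Uses SQLite, not Hive: this only illustrates standard SQL behaviour.
    The table and column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER);
        INSERT INTO sales VALUES
            ('east', 'ann', 100),
            ('east', 'bob', 300),
            ('west', 'cat', 200),
            ('west', 'dan', 50);
        """
    )

    # OVER clause: keep every row, and attach the total for that row's region.
    windowed = conn.execute(
        """
        SELECT rep, amount,
               SUM(amount) OVER (PARTITION BY region) AS region_total
        FROM sales
        ORDER BY rep
        """
    ).fetchall()

    # Subquery in WHERE: keep only reps who sold more than the overall average.
    above_avg = conn.execute(
        """
        SELECT rep
        FROM sales
        WHERE amount > (SELECT AVG(amount) FROM sales)
        ORDER BY rep
        """
    ).fetchall()

    conn.close()
    return windowed, above_avg


if __name__ == "__main__":
    windowed, above_avg = run_examples()
    print(windowed)
    print(above_avg)
```

The first query is the kind of thing analysts reach for when they want a detail row and its group total side by side, without a self-join. The second filters rows against a value computed by another query.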

The plan is to speed up Hive, too. There’s a new execution engine to increase the number of records per second Hive can process, a new columnar file format to provide “a more modern, efficient and high performing” means to store Hive data, and the Tez runtime framework to speed up workloads by eliminating unnecessary tasks, synchronization barriers, and reads and writes to HDFS.
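Why would a columnar layout help? The following is a rough Python sketch of the general idea only, not of Hortonworks' actual file format: the records and the helper names `to_columnar` and `total_amount` are hypothetical. Storing each field's values together means a query that needs one field can skip the others entirely.

```python
def to_columnar(rows):
    """Turn a list of row dicts into a dict of column lists.

    Illustrative only: a real columnar file format also handles
    encoding, compression and on-disk layout.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_amount(columns):
    """Aggregate one column without touching any of the others."""
    return sum(columns["amount"])


if __name__ == "__main__":
    # Hypothetical row-oriented records, as a row store would hold them.
    rows = [
        {"region": "east", "rep": "ann", "amount": 100},
        {"region": "east", "rep": "bob", "amount": 300},
        {"region": "west", "rep": "cat", "amount": 200},
    ]
    columns = to_columnar(rows)
    print(columns["amount"])
    print(total_amount(columns))
```

In a row layout, summing `amount` means reading every whole record; in the columnar layout, only the `amount` list is read, which is the sort of saving analytics queries tend to benefit from.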

A preview of Stinger is planned ahead of the Hadoop Summit in Amsterdam in March. ®
