SciDB: Relational daddy answers Google, Hadoop, NoSQL

Stonebraker doesn't drop ACID

Battle of the rows

Stonebraker reckons the relational staples of logging, locking, latching and buffer management have become the database's biggest burden, even though they helped pioneer and maintain one of its crucial features: data integrity according to the atomicity, consistency, isolation and durability (ACID) principles. The processing needed just to make these features work soaks up 90 per cent of a transaction's CPU cycles, slowing performance and wasting power.

The serial inventor's answer to this particular problem was initially VoltDB. His database speeds things up by keeping data in server memory and partitioning it across the nodes and processor cores of a cluster. ACID is retained because each VoltDB partition is single-threaded and runs autonomously, while data is replicated across the cluster for high availability.
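The single-threaded partition idea can be sketched in a few lines of Python. This is an illustration only, not VoltDB's code: the names Partition, PartitionedStore and deposit are invented here, and replication and durability are left out entirely. Each partition owns a slice of the data and one worker thread, so transactions on that slice run one after another and need no locks or latches.

```python
import queue
import threading
import zlib


class Partition:
    """One slice of the data, owned by exactly one worker thread (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.inbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Transactions are taken off the queue one at a time, so nothing
        # else ever touches self.data concurrently: no locks, no latches.
        while True:
            txn, done = self.inbox.get()
            done.put(txn(self.data))

    def execute(self, txn):
        # Hand the transaction to the owning thread and wait for its result.
        done = queue.Queue(maxsize=1)
        self.inbox.put((txn, done))
        return done.get()


class PartitionedStore:
    """Routes each transaction to the partition that owns its key."""

    def __init__(self, n_partitions=4):
        self.partitions = [Partition() for _ in range(n_partitions)]

    def run(self, key, txn):
        index = zlib.crc32(key.encode()) % len(self.partitions)
        return self.partitions[index].execute(txn)


def deposit(account, amount):
    """Build a transaction that adds amount to account and returns the balance."""
    def txn(data):
        data[account] = data.get(account, 0) + amount
        return data[account]
    return txn


store = PartitionedStore()
store.run("alice", deposit("alice", 100))
balance = store.run("alice", deposit("alice", 50))
print(balance)  # 150
```

Because every transaction for a given key lands on the same partition, each one sees the result of the one before it, however many clients submit work. The cost in this sketch is that a transaction spanning several partitions would need extra coordination, which is not shown.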

VoltDB claims to be 45 times faster than an Oracle relational database on a Dell PowerEdge R610 cluster based on Intel's Xeon 5550, with near-linear scaling on a 12-node cluster. VoltDB was the product of the H-Store project, a collaboration between Stonebraker's MIT home, Brown University, Yale University and Hewlett-Packard Labs.

Before VoltDB, there was Vertica. This used a column-oriented, shared-nothing architecture with a massively parallel processing (MPP) engine and data compression to reduce storage and speed queries. Vertica claims query results between 50 and 200 times faster than databases that store data in rows. Vertica started as the C-Store project, also with Brown and MIT, plus Brandeis University and the University of Massachusetts, Boston.

Stonebraker reckons columnar databases are quicker than row-based relational databases because they know what they are looking for. They don't need to waste time sorting rows.
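A toy Python comparison shows the difference between the two layouts. It is a sketch under simple assumptions, not Vertica's storage engine: the sample records and the function names are made up for illustration. A query that needs one attribute reads a single list in the column layout, and a sorted column with repeated values compresses well with run-length encoding.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "east", "amount": 120},
    {"id": 2, "region": "west", "amount": 75},
    {"id": 3, "region": "east", "amount": 30},
]

# Column layout: one list per attribute, built from the same records.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def total_row_store(rows):
    # Walks every record to pull out the one field the query needs.
    return sum(r["amount"] for r in rows)


def total_column_store(columns):
    # Reads only the "amount" column; the other columns are never visited.
    return sum(columns["amount"])


def run_length_encode(values):
    # Collapse runs of equal neighbouring values into [value, count] pairs.
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return encoded


print(total_row_store(rows), total_column_store(columns))  # 225 225
print(run_length_encode(sorted(columns["region"])))  # [['east', 2], ['west', 1]]
```

Both layouts give the same answer. The point is how much data has to be read to get it, which matters far more on disk than in this in-memory toy.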

VoltDB, Vertica, and - soon - SciDB take Stonebraker into a growing tussle against NoSQL over which architecture is "right" in a fight for mindshare and for customers. SciDB, though, has itself been listed as a NoSQL database.

Facing off against SciDB, Vertica and VoltDB in a range of scenarios are Hadoop, MapReduce, Cassandra, CouchDB, Amazon's SimpleDB and Memcached - the latter being the distributed memory caching companion to MySQL used for scale and speed. Helping push them are their creators such as Google and Amazon or startups like Cloudera, mega-scale customers such as Twitter and Facebook, and an army of evangelists convinced that NoSQL is the future.

Sparks flew between Stonebraker and the NoSQL movement in 2008, when the relational expert incensed MapReduce fans with a joint blog post with DeWitt that called MapReduce a "giant step backward in the programming paradigm for large-scale data intensive applications".

Stonebraker and DeWitt professed amazement at the hype over how MapReduce represented a "paradigm shift in the development of scalable, data-intensive applications". They called MapReduce a good idea for writing "certain types" of general-purpose computations, but one lacking many of the tools and features commonly associated with a DBMS that users have come to depend on.

Bloggers stormed back, damning these "so-called" database experts for "not getting" data in the cloud and - like jealous suitors jumping to their lover's defense - demanded a retraction of this "highly inaccurate article" as if it had slandered their beloved MapReduce.

Most missed the point: Stonebraker and DeWitt weren't calling MapReduce a bad database. They were picking up on the fact that MapReduce - like its open-source clone Hadoop - is being used as if it were a database, with customers dumping more data into it on a daily basis and then needing to transact on and analyze that data. It's a problem that's been creeping into Memcached and NoSQL, with people now trying to make Memcached and NoSQL work with relational databases.

Was Stonebraker surprised by the flames?

"The NoSQL guys are people who know nothing about databases and their first reaction is to lash out, so I'm not surprised [by the reaction]," he said.

"Talk to the MapReduce guys and they are fanatical about 'not invented here'... MapReduce was written by people who don't understand databases at all," an unapologetic Stonebraker continued. "They produced a thing that worked for their crawling applications. MapReduce was written to support the processing pipeline behind Google."

Turning MapReduce and Hadoop into databases would take a long time and a huge rewrite to inject things like data repositories, indexes, query languages and updates.

Does he recant in the face of such a flaming? Far from it. He's as critical as ever.

"If you are over 35, you are over the hill apparently in math," he claimed. "In computer science, the grey beards like me are still viable, and it's for this reason that what goes around comes around. The young guys haven't seen it before and the problem with our computer science education system is the lessons from the past seem to get lost."

And, it would seem, Google agrees with him.

Accidental SQL supporter

Stonebraker's got little time for those who claim it's the language that's slowing down databases serving big data. Hadoop is written in Java, CouchDB in Erlang, and in-memory key-value persistent storage engine Memcached in C. For Stonebraker, the interface is the problem, not the language. Hence Volt has been rewritten to remove 90 per cent of the overhead associated with OLTP.

"I'm not a particular fan of SQL but I don't mind it. Jettisoning it just to, say, 'get record' is a huge mistake."

Interestingly, Stonebraker wrote Ingres in QUEL and left SQL to Ellison. The industry, and history, swung behind SQL, helping catapult Oracle to today's number-one position while Ingres didn't switch to SQL until version six in the mid-1990s - too late to catch Oracle.
