SciDB: Relational daddy answers Google, Hadoop, NoSQL

Stonebraker doesn't drop ACID

ACID fan and language lessons

"I can remember a debate in the '70s: assembly language jockeys would say C is too slow. I need to control my own registers. Twenty-five to 30 years later, we know that's not true and compilers are as good as or better than humans at producing machine-optimized code. Just like you'd never bet on assembly language today. You should not bet on low-level database repositories by alleging they are faster than a higher-level language."

As far as Stonebraker is concerned, these NoSQL architectures were built to fix specific problems by the companies that made them. Now, they are being peddled to the wider world. And that's the real problem because they undermine the principles of ACID that have helped guarantee the performance and reliability of data and that have fundamentally underpinned relational and Stonebraker's work. Even as he's gone non-relational with SciDB, Stonebraker said SciDB will comply with ACID.

Sure, Memcached, for example, is popular - used by Twitter, YouTube and Digg - and it's often used in conjunction with MySQL. But Memcached is not ACID-compliant. It might be fine at processing observations, tweets, videos and news, but customers outside the world of Web 2.0 clouds won't want to run things like financials through a Memcached system. Memcached is not alone: most NoSQLers make no bones about dumping ACID.

Stonebraker reckons the NoSQL community has ditched ACID because it's "too expensive" but installing an ACID-free database is a bet against the future. As organizations grow, they will take decisions that inevitably put more of their important data into such systems and it's then that data integrity as guaranteed by ACID will matter.

"I'm a huge fan of ACID," Stonebraker confessed. "The database transaction model has served us well for 30 years and essentially everyone who jettisons it regrets it because it gives you a systematic underpinning for your data. A lot of the NoSQL guys jettison ACID and that's a huge mistake because, by and large, the NoSQL guys are not database experts.

"You might not need ACID now, but database applications live a very long time...requirements may change over that time. If you decide not to run ACID, make sure you never need it in the future," Stonebraker said.

ACID is what businesses need for mission-critical stuff. "I have a friend at a large telco who's not interested in NoSQL because they give up ACID compliance," Stonebraker reckoned.

Of course, Stonebraker is more than just an MIT academic. He's part businessman. He's co-founder and chief technology officer for VoltDB, and a co-founder and board member of Vertica. That makes this more than a battle of architectures. It's a fight for customers' dollars. It was no coincidence that Stonebraker and DeWitt's attack on MapReduce was launched from their Vertica blog.

Vertica's customers include Mozilla - the open-source operation uses Stonebraker's creation with open-source BI Pentaho to analyze billions of Firefox user log records per day in an attempt to improve product R&D. Guess.com, meanwhile, uses Vertica with MicroStrategy to analyze retail and inventory data in its US and European data centers.

Vertica's also landed some Web 2.0 big-data fish: Zynga, maker of the popular FarmVille and Mafia Wars hits on Facebook, has come out as a relational fan for analytics. In a statement supporting Vertica 4.0, Zynga's vice president of analytics Ken Rudin called Vertica a "no wind-up toy", running a daily load of 40 million players and 3TB of data across 230 nodes and two clusters on the database's columnar data warehouse.

This wouldn't be so awkward if it weren't for the fact that Facebook has built its own big-data offering, Cassandra, which has its own take on columns. Cassandra's columns require a mind switch for those coming from a relational background, while Vertica provocatively calls itself "the only true enterprise-ready MPP columnar database" with an emphasis on "only" and "columnar database".

Stonebraker's other recent creation, VoltDB, which started operations in 2009, doesn't yet list any customers.

Roasting relational elephants

Stonebraker's not just critical of the NoSQL new wave: he's got plenty of fire left for the relational "elephants," Microsoft and Oracle. Increasingly, their answer to high-end relational processing is to boost the software by fusing it with the underlying hardware.

Oracle's built the Exadata server, a hardware appliance running Oracle's database that combines Smart Flash Cache to reduce bottlenecks and columnar compression to reduce data warehousing table size with solid-state multiterabyte storage arrays to offload data. Microsoft's partnered with Bull, Dell, EMC, HP and IBM on massively parallel processor appliances running SQL Server - SQL Server 2008 R2 Parallel Data Warehouse.

The concepts are similar to Stonebraker's warehousing and analytics work, but Stonebraker has not allowed himself to become married to a small set of certified hardware suppliers with specialized chips or hardware. Stonebraker's goal is to achieve scale through software working on affordable, commodity hardware - taking advantage of multi-core CPUs and greater memory. According to Stonebraker, Oracle and Microsoft can just keep adding more expensive hardware but the fundamental problems or bottlenecks won't be solved.
