SciDB: Relational daddy answers Google, Hadoop, NoSQL
Stonebraker doesn't drop ACID
Keep on keeping on
"VoltDB runs more transactions per second on fewer dollars than Exadata 2 on all its hardware," Stonebraker said. "If you give 20 nodes to the elephants, you get 20 nodes of performance. If you give the same 20 nodes to VoltDB, we will go a factor of 40 faster. This particular elephant runs for a while but there are problems with RAC and Exadata 2."
With SciDB coming, Stonebraker believes the technologies he helped turn into a multibillion-dollar industry no longer have a monopoly. But diversity doesn't mean abandoning the architectures of the past, and he remains committed to the underlying principles of reliability and integrity, and to the familiarity of SQL and the relational model, in the world of big data.
What comes after SciDB for Stonebraker and for databases? Stonebraker still sees "horrible challenges" in data integration, especially for unstructured and semi-structured data.
He cites an unnamed New York bank he once advised that was massively decentralized and had no easy way to establish a single customer list. "The bank couldn't tell IBM in Armonk was the same as IBM SA in Madrid and had no way to solve this problem. Eventually, with a lot of pain and suffering, they sent letters to all their customers asking: 'Who are you?'"
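To make the bank's problem concrete, here is a minimal sketch of the simplest kind of customer matching: normalize each name, then group records that share the normalized form. It is an illustration only, not anything the bank or Stonebraker described. The suffix list, the function names and the sample records are all assumptions, and real entity resolution has to cope with misspellings, subsidiaries and translations that this ignores.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form suffixes; a real list would be far longer.
LEGAL_SUFFIXES = {"sa", "inc", "corp", "corporation", "ltd", "llc", "gmbh", "plc"}


def normalize_name(name: str) -> str:
    """Lowercase, delete punctuation, and drop legal-form suffix tokens."""
    tokens = re.sub(r"[^\w\s]", "", name.lower()).split()
    core = [t for t in tokens if t not in LEGAL_SUFFIXES]
    # If every token was a suffix, fall back to the unfiltered tokens.
    return " ".join(core or tokens)


def group_customers(records):
    """Group (name, city) records under their normalized name."""
    groups = defaultdict(list)
    for name, city in records:
        groups[normalize_name(name)].append((name, city))
    return dict(groups)


# Made-up sample records echoing the example in the quote.
records = [
    ("IBM", "Armonk"),
    ("IBM SA", "Madrid"),
    ("I.B.M. Corp.", "New York"),
    ("Acme Ltd", "London"),
]
groups = group_customers(records)
```

Here the three IBM variants land in one group and Acme in another. Exact matching on a cleaned-up key is cheap but brittle, which is part of why integration at the scale Stonebraker describes remains hard.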
"My suspicion is the number-one cause of outages is human error and after that is badly tested apps, and everything else is way down. What we ought to be focused on in terms of high availability is probably not what we are focused on at all."
Scalability is a major issue, and he points to Facebook, the company that created Cassandra, which runs MySQL "4,000 ways" and has added 9,000 instances of Memcached. "I shudder at how they keep that thing up, because it's glue and baling wire and application-level recovery. They are desperate for anything that's more scalable and goes faster," he said.
For all that, he sees a world where NoSQL co-exists with relational databases, and where some NoSQL systems survive a likely shakeout and consolidation. Features will also cross over: VoltDB, for example, will get a JSON interface, adding a document management model on top of SQL.
"The future will hold some number - maybe half a dozen or a dozen - of interesting data management alternatives that are very good at what they do and complement database systems like SQL Server, Oracle and DB2. Conventional legacy row stores will be one of the half dozen things and there will be others."
High availability is another concern, especially as data gets bigger and services grow larger and supposedly more critical. Stonebraker is studying outages at an unnamed "major worldwide institution".
"My suspicion is the number-one cause of outages is human error and after that is badly tested apps, and everything else is way down. What we ought to be focused on in terms of high availability is probably not what we are focused on at all," he says.
It was the technology that inspired Stonebraker in 1973 when he started Ingres and stole a march on IBM: "Ted Codd's ideas were clearly superior to Codasyl and IMS. Hence my early interest in the technology," he said. And now, after 40 years?
"It is possible that DBMS research will peter out and there will be no more innovation. However, I doubt it." ®