FoundationDB uncloaks ACID-compliant NoSQL beta
Lets developers have infinite cake and eat it in ACID-compliant bites
Skeptical developers can now get their hands on a NoSQL database that supports ACID-compliant transactions, a handy tech for web applications that expect their usage to take off like a rocket.
A beta version of the FoundationDB NoSQL database became available for download on Monday. It differs from other NoSQL systems in that it supports ACID-compliant transactions, while having the scalability characteristics of a NoSQL datastore.
"What we're trying to do here is not even build a database per se, we're trying to build a transactional storage substrate that can then be used to expose a variety of different data models," FoundationDB cofounder David Rosenthal told The Register.
The technology is a good candidate to become the storage layer for huge net-orientated applications, Rosenthal noted.
A vast website with tons of requests per second along with an equally huge number of writes would be an ideal proving ground for the tech, he said. (Reddit would be one such example of a site that fits this scenario.)
By allowing for ACID (Atomicity, Consistency, Isolation, Durability) transactions, the FoundationDB system can give strong guarantees around data.
This is handy if you're building a massive web application that cannot mess up at all for fear of customer reprisals, such as a site that helps to track the flow of goods for a logistics company.
At its most basic level, FoundationDB is a key-value store that implements ACID compliance using Flow, a language based on C++, which the machines in a cluster use to talk to one another to perform transaction processing and conflict resolution.
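To make "transaction processing and conflict resolution" concrete, here is a minimal single-process sketch of one common approach, optimistic concurrency control over a key-value store. It is an illustration only: the names `ToyStore`, `ToyTransaction`, `ConflictError` and `transfer` are invented for this example, and nothing here is taken from FoundationDB's code or API, which spreads this work across a cluster.

```python
class ConflictError(Exception):
    """Raised when a key this transaction read was changed before commit."""


class ToyStore:
    """Hypothetical in-memory key-value store; single-threaded, not durable."""

    def __init__(self):
        self._data = {}           # key -> value
        self._versions = {}       # key -> commit number of the last write
        self._commit_counter = 0

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, store):
        self._store = store
        self._reads = {}    # key -> version seen at first read
        self._writes = {}   # key -> buffered new value

    def get(self, key, default=None):
        if key in self._writes:               # read your own writes
            return self._writes[key]
        self._reads.setdefault(key, self._store._versions.get(key, 0))
        return self._store._data.get(key, default)

    def set(self, key, value):
        self._writes[key] = value             # nothing is visible until commit

    def commit(self):
        store = self._store
        # Conflict check: every key we read must be unchanged since we read it.
        for key, seen in self._reads.items():
            if store._versions.get(key, 0) != seen:
                raise ConflictError(key)
        # Apply all buffered writes together (all-or-nothing).
        store._commit_counter += 1
        for key, value in self._writes.items():
            store._data[key] = value
            store._versions[key] = store._commit_counter


def transfer(store, src, dst, amount):
    """Move `amount` from one key to another, retrying on conflict."""
    while True:
        tx = store.begin()
        balance = tx.get(src, 0)
        if balance < amount:
            raise ValueError("insufficient funds")
        tx.set(src, balance - amount)
        tx.set(dst, tx.get(dst, 0) + amount)
        try:
            tx.commit()
            return
        except ConflictError:
            continue  # another commit touched our keys; start over
```

If two transactions read the same key and both try to update it, the first to commit wins and the second gets a `ConflictError` and retries, so no update is silently lost. A real distributed system also has to handle durability, failures and many machines, none of which this toy attempts.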
Unlike some NoSQL systems that also incorporate a rough analytical tier, FoundationDB intends to become an all-purpose storage layer for the next generation of web properties.
A developer might want this because a highly scalable, ACID-compliant storage layer takes away a lot of the worry about flinging ever-growing amounts of data into multiple different storage infrastructures.
FoundationDB's promise is that you can have tons of read and write requests occurring concurrently through any interface layer you like – for example, a document data model like MongoDB, or even a simple Python script – and the system underneath will "just work".
This means that as more people use your application it is trivial to increase its storage, since you just need to throw more machines at your hardware cluster rather than repeatedly rewriting your application.
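The idea of putting a richer data model on top of a plain key-value store can be sketched in a few lines. The example below flattens a nested document into tuple keys and rebuilds it again. It is a hypothetical illustration: `flatten` and `rebuild` are names made up here, not FoundationDB's interface, and a real layer would also need a byte encoding for keys and handling for lists and empty sub-documents.

```python
def flatten(doc_id, doc, prefix=()):
    """Yield (key_tuple, value) pairs for every leaf field in a nested dict."""
    for field, value in doc.items():
        path = prefix + (field,)
        if isinstance(value, dict):
            # Recurse into sub-documents; the path becomes part of the key.
            yield from flatten(doc_id, value, path)
        else:
            yield (doc_id,) + path, value


def rebuild(pairs, doc_id):
    """Reassemble the nested dict for one document from flat pairs."""
    doc = {}
    for key, value in pairs:
        if key[0] != doc_id:
            continue  # pair belongs to a different document
        node = doc
        for field in key[1:-1]:
            node = node.setdefault(field, {})
        node[key[-1]] = value
    return doc


doc = {"name": "Ada", "address": {"city": "London", "geo": {"lat": 51.5}}}
pairs = dict(flatten("user:1", doc))
# pairs now holds keys such as ("user:1", "address", "city")
restored = rebuild(pairs.items(), "user:1")
```

Because every field of a document shares the same key prefix, an ordered key-value store can fetch a whole document with one range read, and a transaction can update several fields together.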
It's worth noting that developers will still be working within the nebulous world of NoSQL, so they could get bitten by problems introduced when adding SQL-like layers on top of FoundationDB for querying purposes.
A tool for Google-scale problems
No decent ideas are born in a vacuum, and FoundationDB's heritage can be traced back to ideas conceived at Google after the web giant ran into problems with its own systems.
The technology's proposition is roughly equivalent to Google's "Spanner", a globe-spanning database that Google developed after Mountain View engineers ran into problems with a NoSQL storage substrate called BigTable.
Both Spanner and FoundationDB are designed to run in a distributed manner and can house large amounts of data. Google's Spanner is expected to eventually scale to embrace millions of nodes. FoundationDB's beta can handle many nodes and terabyte-scale databases at launch, but the company expects it to eventually scale by "a couple of orders of magnitude," Rosenthal said.
"We're not quite at the data volume that a massive Hadoop installation would be at today," he told The Reg. "That's because ... we're not targeting analytics directly. The thing that is special about FoundationDB is it can do a lot of reads and writes in a transactional way."
How many reads and writes?
The company has published detailed metrics from tests run on a $39,000, 24-machine cluster across a dataset of two billion key-value pairs. It reports a stable 500,000 operations per second with a mix of 90 percent reads and 10 percent writes, 150,000 operations per second with a 50/50 mix, and up to 1,080,000 writes per second across blocks of 140 adjacent keys.
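For a rough sense of scale, the published figures can be broken down per machine. The short calculation below uses only the numbers quoted above and assumes the load is spread evenly across all 24 machines and that $39,000 is the whole hardware cost. Both are simplifications, so treat the results as back-of-the-envelope estimates rather than vendor figures.

```python
MACHINES = 24
CLUSTER_COST_USD = 39_000

# Operations per second for each workload quoted in the article.
workloads = {
    "90% read / 10% write": 500_000,
    "50% read / 50% write": 150_000,
    "writes, blocks of 140 adjacent keys": 1_080_000,
}


def per_machine(ops_per_second, machines=MACHINES):
    """Average throughput per machine, assuming an even spread."""
    return ops_per_second / machines


def usd_per_kops(ops_per_second, cost=CLUSTER_COST_USD):
    """Hardware dollars per 1,000 ops/s of cluster throughput."""
    return cost / (ops_per_second / 1000)


cost_per_machine = CLUSTER_COST_USD / MACHINES  # 1625.0 dollars

for name, ops in workloads.items():
    print(
        f"{name}: {per_machine(ops):,.0f} ops/s per machine, "
        f"${usd_per_kops(ops):.2f} per 1,000 ops/s"
    )
```

On these assumptions each machine costs about $1,625 and handles roughly 20,800 operations per second on the read-heavy mix, 6,250 on the 50/50 mix, and 45,000 on the bulk-write test.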
The system is available for download beginning on Monday.
"I think FoundationDB is evidence that as organizations have dabbled with NoSQL, they miss features and consistency guarantees that they've grown to expect from conventional [SQL] databases," Mike Lyle, the CTO for TransLattice, a PostgreSQL-based technology that implements some Spanner tricks with a SQL flavor, told us via email.
But while FoundationDB's techniques may be solid, the demand for it may be limited, as other companies can implement ACID-like capabilities without going the whole CAP-ACID hog. Some applications may not need its magic.
"DB guys say they love ACID, but most developers don't care – they just want to be comfortable their DB won't lose their data," Bradford Stephens, the CEO of Drawn to Scale, a massive SQL database set on top of Hadoop, told The Register via email. "You can do that by minimizing bugs, choosing good hardware, etc. If you're doing credit cards, ERP, tracking shipping containers, etc. ... then you need ACID/transactions. The average web/mobile app doesn't."
Whether the size of its market is large or limited, FoundationDB's approach to the storage of data is representative of a shift that is percolating through the industry: companies have spent a few years drinking the NoSQL scaling Kool-Aid and are now trying to redesign their NoSQL systems to better fit with enterprise models of data consistency and reliability (ACID), or to be easier for the layman to deal with (SQL-query engines).
How this shift will play out will determine not only the scaling properties inherent to future websites, but also the types of massively distributed applications that can be built on top of them. ®