What do we want? Strong consistency! When do we... oh, it's in Riak v2

NoSQL datastore flexes muscles to woo enterprises


RICON West 2013 Riak-steward Basho has spliced crucial enterprise features into the second version of its NoSQL distributed database, and also admitted that its system can't do everything on its own.

The technical preview of version two of the software was released at the company's RICON West conference in San Francisco on Tuesday, bringing with it an option for strong consistency, better access-control policies, advanced security, and mix-and-match replica allocations.

What will make enterprises salivate, we reckon, is the arrival of strong consistency.

Riak previously offered only eventual consistency, which is relatively fast but can leave the precise value of an accessed object uncertain for a short time. Now the system also provides strong consistency, which guarantees the integrity of every transaction but is a relatively slow process.

Simply put, you'd use eventual consistency for retrieving prices for stuff in an online marketplace, for instance, but use strong consistency when calculating the customer's total at final shopping basket checkout.

Consistency is crucial to distributed databases, and is one of the three key elements in Eric Brewer's CAP theorem, which states that a distributed database can have any two of consistency (C), availability (A) and partition tolerance (P), but never all three.

Riak previously had the A and P parts, but now it can have the C for some workloads, some of the time, and with caveats.

"You can choose eventual or strong consistency," Basho's chief technology officer Justin Sheehy says. "Nobody gets to beat CAP, but for a named subset of your data you can choose to have a very different mechanism used for propagating writes and reads to replicas."

This approach may increase overall latency, but will give enterprises the ability to have strong guarantees when accessing a subset of their data.

"In all the cases where Riak would normally accept a conflicting value, instead all but one of those conflicting values [will] fail loudly back to the client," Sheehy explained.
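The "fail loudly" behaviour Sheehy describes can be pictured as a conditional write. What follows is a minimal single-process sketch and not Riak's API: the `StrongStore` class, the `ConflictError` exception, the version counter and the `basket:42` key are all hypothetical names invented for illustration. Two clients read the same version of a key, both try to write, and only the first write is accepted.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale version of the key."""


class StrongStore:
    """Toy model of a strongly consistent key (hypothetical, not Riak's API)."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        # A missing key is reported as version 0 with value None.
        return self._data.get(key, (0, None))

    def put(self, key, value, expected_version):
        current_version, _ = self.get(key)
        if current_version != expected_version:
            # The conflicting write is rejected instead of being kept as a sibling.
            raise ConflictError(
                f"{key!r}: expected version {expected_version}, "
                f"found {current_version}"
            )
        self._data[key] = (current_version + 1, value)
        return current_version + 1


def checkout_race():
    store = StrongStore()
    store.put("basket:42", {"total": 10}, expected_version=0)

    # Two clients read the same version, then both try to write.
    version_a, _ = store.get("basket:42")
    version_b, _ = store.get("basket:42")

    store.put("basket:42", {"total": 15}, expected_version=version_a)
    try:
        store.put("basket:42", {"total": 12}, expected_version=version_b)
        outcome = "accepted"
    except ConflictError:
        outcome = "rejected"
    return outcome, store.get("basket:42")
```

In this toy model the second client has to re-read the key and retry, which is the price of the guarantee: an eventually consistent store would have accepted both writes and left the application to reconcile them later.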

Conflict-free replicated data types

To support the use of Riak as a distributed data store, the company has also added distributed data types ('sets', 'registers', 'flags', and 'maps') based on research into conflict-free replicated data types (CRDTs). This, the company says, should "enable developers to spend less time thinking about the complexities of vector clocks and sibling resolution and, instead, focusing on using familiar, distributed data types to support their applications’ data access patterns." Further information on the new technology is available in a document on Basho's GitHub pages.
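To give a flavour of what a conflict-free set does, here is a toy observed-remove set. It is an illustration of the general CRDT idea under our own assumptions, not Basho's implementation: the `ORSet` class, its tagging scheme and the replica names are hypothetical. Each add carries a unique tag, a remove only cancels the tags that replica has seen, and merging is a plain set union, so replicas converge without the application resolving siblings.

```python
class ORSet:
    """Toy observed-remove set; an illustration, not Riak's implementation."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self._counter = 0
        self.adds = set()     # (element, tag) pairs ever added
        self.removes = set()  # (element, tag) pairs observed and then removed

    def add(self, element):
        # Each add gets a tag that is unique across replicas.
        self._counter += 1
        self.adds.add((element, (self.replica_id, self._counter)))

    def remove(self, element):
        # Only cancel the add-tags this replica has actually seen.
        self.removes |= {pair for pair in self.adds if pair[0] == element}

    def merge(self, other):
        # Union is commutative, associative and idempotent, so replicas
        # converge whatever order they exchange state in.
        self.adds |= other.adds
        self.removes |= other.removes

    def value(self):
        return {element for element, _ in self.adds - self.removes}


def concurrent_edit():
    a, b = ORSet("a"), ORSet("b")
    a.add("milk")
    b.merge(a)
    a.remove("milk")  # replica a removes the item...
    b.add("milk")     # ...while replica b concurrently re-adds it
    b.add("eggs")
    a.merge(b)
    b.merge(a)
    return a.value(), b.value()
```

After the two merges both replicas hold the same contents: the concurrent re-add wins over the remove because its tag was never observed by the replica that did the removing.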

The company has also made various usability tweaks, including shifting configuration management away from Erlang literal syntax to a standardized sysctl-style file format.

This should make the database easy to maintain even for people not directly familiar with it, Sheehy said.

"Someone that's done much of any administration of their servers they will immediately understand it and edit it without any confusion," he said.

Riak's replication model has been given a tuneup, the company said. Where previously IT departments needed to store three copies of their data in every data center, they can now change this number according to their needs.

This lets sysadmins store fewer or more copies of replicated data across multiple facilities, and they can mix and match the amount of data replicated as required. For instance, if a business has a multitude of co-location facilities around the world then it may want to have three replicas in its primary facility and single copies in others.
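The storage arithmetic behind that example is simple enough to sketch. The site names and the `total_copies` helper below are hypothetical and are not Riak configuration; they only compare a uniform three-copies-everywhere layout with the mixed layout described above.

```python
def total_copies(replicas_per_site):
    """Sum the copies of each object kept across all sites."""
    if any(count < 0 for count in replicas_per_site.values()):
        raise ValueError("replica counts must be non-negative")
    return sum(replicas_per_site.values())


# Hypothetical site names, chosen only for illustration.
uniform = {"primary": 3, "london": 3, "tokyo": 3}
mixed = {"primary": 3, "london": 1, "tokyo": 1}

uniform_total = total_copies(uniform)
mixed_total = total_copies(mixed)
```

With three sites, the uniform layout keeps nine copies of every object while the mixed layout keeps five.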

Though Basho has added several features to the new version of Riak, it has also stepped back from others: Riak 2 will see the company offer full search integration with the Apache Solr project, rather than build its own search tech.

"Some of the best people in the world have been working on Apache Solr for years," Sheehy said, then noted that as Basho's expertise is in distributed systems and databases it would be "too much hubris" to try for search as well.

Though Riak originally began life as a clever implementation of some of the ideas found in Amazon's seminal Dynamo paper, it has since grown into a full-fat database with good reliability properties, and excellent traits for the backing store of a distributed system.

"We started out providing all of our guarantees at the plumbing layer. The earliest versions had an incredibly spartan user interface and we've improved that over time," Sheehy said. "We built from the bottom up instead of the top down." ®
