DataStax cranks up Facebook NoSQL to 3.0 with enterprise features
Weaving in the latest Hadoop data muncher and Solr search
DataStax, the company that was founded to commercialize the Cassandra NoSQL data store created by Facebook and thereby make it usable by mere enterprise data centers, is keeping to its cadence and is rolling out a new release of its DataStax Enterprise Edition.
The company has also put out an update to its Community Edition, which is available for free and which does not include some of the proprietary integrations between the Cassandra data store and the Hadoop big data muncher and the Solr search engine that have been tweaked to run atop Cassandra.
With the Enterprise Edition 3.0 update, DataStax is taking on one of the big criticisms of Hadoop in the enterprise: that the software doesn't have sufficient security and access control to be put into data centers and linked on the same networks with production back-end systems.
Robin Schumacher, vice president of product management at DataStax (who used to have the same role at MySQL before and after the Sun Microsystems acquisition of that open source database many years ago), tells El Reg that the basic Kerberos integration that Hadoop offers is a good start, but it is not good enough.
To that end, DataStax has added a bunch of features to its enterprise data munching stack, and is releasing some of these to the open source community edition of the Cassandra data store as well.
The open source tweaks include internal authentication and internal object permissions, with the same grant/revoke paradigm used by relational databases also being applied to the NoSQL data store – in this case, it is done at a table or column level. Relational databases also have row-level locking, but there is no analog to this in a NoSQL data store.
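For readers who have not met the grant/revoke paradigm, here is a minimal Python sketch of the idea. It is a toy model, not Cassandra's or DataStax's code: the `PermissionStore` class, the user name, the permission names, and the `sales.orders` resource are all made up for illustration.

```python
from collections import defaultdict


class PermissionStore:
    """Toy grant/revoke registry (hypothetical; not Cassandra's implementation)."""

    def __init__(self):
        # Maps (user, resource) to the set of permissions held on that resource.
        self._grants = defaultdict(set)

    def grant(self, user, permission, resource):
        """Give a user one permission on one resource, such as a table."""
        self._grants[(user, resource)].add(permission)

    def revoke(self, user, permission, resource):
        """Take a permission away; revoking something never granted is a no-op."""
        self._grants[(user, resource)].discard(permission)

    def is_allowed(self, user, permission, resource):
        """Access is denied unless it was explicitly granted."""
        return permission in self._grants.get((user, resource), set())


store = PermissionStore()
store.grant("alice", "SELECT", "sales.orders")
print(store.is_allowed("alice", "SELECT", "sales.orders"))  # True
print(store.is_allowed("alice", "MODIFY", "sales.orders"))  # False
store.revoke("alice", "SELECT", "sales.orders")
print(store.is_allowed("alice", "SELECT", "sales.orders"))  # False
```

The point of the model is the default: nothing is permitted until an administrator grants it, and a revoke puts the user back where they started.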
DataStax has also added client-to-node encryption based on the familiar SSL protocol to make sure that data being passed between Cassandra and an end user device is encrypted in flight.
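What in-flight encryption looks like from the client side can be sketched with Python's standard `ssl` module. This is a generic illustration under assumptions, not the DataStax driver API: the helper names are hypothetical, and Python negotiates TLS, the successor to SSL.

```python
import socket
import ssl


def make_client_tls_context(ca_file=None):
    """Build a client-side context that verifies the server's certificate.

    ca_file is an optional path to a CA bundle (hypothetical parameter name);
    when it is None, the system's default trusted CAs are used.
    """
    return ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)


def open_encrypted_connection(host, port, context):
    """Open a TCP connection and wrap it so all traffic is encrypted in flight."""
    raw = socket.create_connection((host, port))
    return context.wrap_socket(raw, server_hostname=host)


ctx = make_client_tls_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: the server must present a valid cert
print(ctx.check_hostname)                    # True: the cert must match the host name
```

`open_encrypted_connection` is defined but not called here, since it needs a live server to talk to.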
For those customers who pay for DataStax Enterprise Edition 3.0, there are some goodies that are not available in the community edition. These include external authentication, which allows for the Cassandra, Hadoop, and Solr modules of the stack to be hooked into existing Kerberos or LDAP servers to lock down access to all three.
The enterprise edition also has data encryption for data at rest inside the stack, with AES-128 being the preferred encryption method but other algorithms being available. And finally, the 3.0 update of Enterprise Edition has data auditing features, which allow you to set audit trails for writes, reads, log-ins, errors, and just about every aspect of the stack.
DataStax welds together the Cassandra NoSQL data store, the Hadoop Big Data muncher, and the Solr search engine
The DataStax Enterprise Edition includes the commercial version of Cassandra, which is called DataStax Enterprise Database Server, technically, as well as a homegrown management tool called OpsCenter Enterprise, a graphical tool with a command line interface that allows you to manage Cassandra, Hadoop, and Solr from the same pane of glass.
Schumacher says that you can set up a ten-node cluster running the three elements of the stack in about three minutes. "It's pretty fast and does what it does well," says Schumacher.
The OpsCenter manager also now has a restore feature that is a companion to the backup feature that has been part of the bundle since last year, and it can do object-level restores, which are particularly valuable to Hadoop shops. OpsCenter also has a visual object manager that allows you to create new Cassandra tables in a few points and clicks.
The enterprise edition of the distro also includes 24x7 tech support coverage on the elements of the stack, which include Cassandra 1.1.8, Hadoop 1.0.4, and Solr 4.0.
DataStax Community Edition 1.2, also just announced, is based on more recent and less-hardened versions of the open source Cassandra 1.2 NoSQL data store plus a freebie and open source variant of the OpsCenter manager and sample databases and applications to get users started if they are unfamiliar with Cassandra.
This update to the core Cassandra has a new feature called Virtual Nodes, which is not a way of virtualizing a server node running the NoSQL database but rather a way of multi-streaming access to data chunks and reconstructing missing data if a physical node fails, which allows recoveries to be done much more quickly.
These VNodes are not required to use Cassandra 1.2; you can use the existing node structure, and presumably when Cassandra 1.2 comes to DataStax Enterprise Edition later this year, the same will hold true.
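Why many small ranges per machine can speed up recovery is easier to see in a toy model. The Python sketch below is an assumption-laden illustration, not Cassandra's partitioner or replication logic: the function names, the SHA-256 hash, the six-node cluster, and the 64 tokens per node are all invented for the example. It places each node on a hash ring one or more times and then asks which surviving nodes sit next to a failed node's ranges.

```python
import hashlib


def token(key):
    """Map a string to a ring position (toy hash, not Cassandra's partitioner)."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


def build_ring(nodes, tokens_per_node):
    """Return a sorted list of (token, node); each node owns several ring positions."""
    return sorted(
        (token(f"{node}-{i}"), node)
        for node in nodes
        for i in range(tokens_per_node)
    )


def neighbours_of_failed(ring, failed):
    """Surviving nodes that follow each of the failed node's positions clockwise.

    Assumes at least one node other than `failed` is on the ring.
    """
    peers = set()
    for idx, (_, owner) in enumerate(ring):
        if owner != failed:
            continue
        step = 1
        # Skip over any further positions that also belong to the failed node.
        while ring[(idx + step) % len(ring)][1] == failed:
            step += 1
        peers.add(ring[(idx + step) % len(ring)][1])
    return peers


nodes = [f"node{i}" for i in range(6)]
single_token_ring = build_ring(nodes, tokens_per_node=1)
vnode_ring = build_ring(nodes, tokens_per_node=64)

print(len(neighbours_of_failed(single_token_ring, "node0")))  # 1
print(sorted(neighbours_of_failed(vnode_ring, "node0")))
```

With one position per machine, a dead node has exactly one clockwise neighbour, so a single peer carries the rebuild. With many positions per machine, the dead node's ranges border several different peers, and the work can be spread across them in parallel.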
DataStax Enterprise Edition is available now for early adopter customers, and it will take about a month before the final kinks are worked out and it is ready for prime time, according to Schumacher.
DataStax is still not publishing list prices for its software and support services, but Schumacher did confirm to El Reg that pricing has not changed since the 2.0 release came out last May. At that time last year, DataStax said that it cost on the order of a few thousand dollars per physical server node for the enterprise edition of its code, which it said was about an order of magnitude lower than what an enterprise-grade relational database costs. ®