DataStax bags $45 million to massage Cassandra
Funding foretells coming storm for trad database vendors
DataStax has trousered $45 million in funding to help it ramp up the sales and distribution of its paid-for and free versions of the Cassandra database.
The funding was announced by the company on Tuesday, along with technical updates to both versions of the open source database.
Cassandra was developed at Facebook in the late 2000s to overcome challenges the team had run into when storing large amounts of data that needed to be recalled quickly.
Like rival "NoSQL" databases, Cassandra takes its key technical influences from the seminal "Dynamo" paper which was published by Amazon in 2007, but unlike rivals it also works its tricks using tech based on Google's 2006 BigTable paper.
This gives the system the eventual consistency and distributed properties of Dynamo, along with the columnar data model of BigTable.
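For readers who want to see what "eventual consistency" means in practice, here is a toy Python sketch. It is not Cassandra's code: the `Replica` class and `sync` function are hypothetical names invented for illustration, and the model assumes a simple last-write-wins rule based on timestamps. It shows a replica serving stale data until a background pass brings the copies back into agreement.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node holding its own, possibly stale, copy of the data."""
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, timestamp, value):
        # Last-write-wins (an assumption of this sketch): keep the newest entry.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        return self.store.get(key)


def sync(replicas, key):
    """Toy repair pass: copy the newest version of a key to every replica."""
    versions = [r.read(key) for r in replicas if r.read(key) is not None]
    if not versions:
        return
    newest = max(versions, key=lambda version: version[0])
    for replica in replicas:
        replica.write(key, *newest)


def demo():
    replicas = [Replica(), Replica(), Replica()]
    for replica in replicas:
        replica.write("user:1", 1, "old")
    # A later write reaches only one replica before the others are asked.
    replicas[0].write("user:1", 2, "new")
    stale = replicas[2].read("user:1")
    sync(replicas, "user:1")
    fresh = replicas[2].read("user:1")
    return stale, fresh
```

Calling `demo()` returns the value the third replica reported before and after the repair pass: stale first, up to date afterwards.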
The series D wonga round was led by Scale Venture Partners, along with new investors DFJ Growth and Next World Capital, and existing investors Lightspeed Venture Partners, Crosslink Capital, and Meritech Capital Partners.
The funding highlights the new phase of maturity that systems such as Cassandra, MongoDB, Riak, and CouchDB have entered, as their technologies have become easier to use and people have adjusted to the newer way of doing things that these databases mandate. MongoDB's Series E round was $42 million, for example, and Basho raised $6 million in 2012.
DataStax will use the money to fund international expansion, product development, and further positions in sales and marketing.
The company also introduced version 3.1 of its paid-for product DataStax Enterprise, and version 2.0 of its DataStax Community Edition.
The main advancement is the addition of the Cassandra Query Language (CQL) to DataStax Enterprise as a binary protocol with .NET and Java drivers. CQL is a SQL-like query language that makes the system more familiar to people who have previously queried databases entirely using SQL. CQL is close to SQL but not an exact match, as it doesn't support JOINs due to Cassandra's underlying architecture.
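What does life without JOINs look like? A common workaround in distributed stores is to denormalise: copy the data you would have joined into each row at write time, so a read is a single lookup by key. The Python sketch below models that idea with plain dictionaries. The `orders_by_user` table and both functions are hypothetical names made up for this example, not part of CQL or any DataStax driver.

```python
from collections import defaultdict

# Hypothetical "table", keyed by the value we intend to query on: user_id.
orders_by_user = defaultdict(list)


def record_order(user_id, user_name, order_id, item):
    """Write path: store the user's name alongside every order row.

    A relational schema would keep the name in a separate users table and
    JOIN at read time; here the duplication is paid for at write time.
    """
    orders_by_user[user_id].append(
        {"order_id": order_id, "user_name": user_name, "item": item}
    )


def orders_for(user_id):
    """Read path: one lookup by key, sorted by order_id, no JOIN needed."""
    rows = orders_by_user.get(user_id, [])
    return sorted(rows, key=lambda row: row["order_id"])
```

The trade-off is that a change to the user's name has to be written to every copy, which is the price of keeping reads to a single key lookup.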
"If you're trying to drag a relational system kicking and screaming into a multi-machine clustered environment, then the first thing you have to do is rip out the relational parts from the application," DataStax CTO Jonathan Ellis says. "There's technical reasons behind this decision, and the big one is that unlike the relational world, Cassandra is a distributed system and built to be one from the ground up."
Version 3.1 also integrates with the Solr 4.3 search technology, and can manage up to ten times the amount of Cassandra data per node because DataStax is using a tweaked storage engine that does away with size limits imposed by garbage collection within the Java heap.
As for the free version, DataStax is wrapping in Cassandra 2.0, and the free version should be released in August. New features include better data compaction, greater reliability when replicating data, a new transaction mechanism named Compare and Set, and the ability to have event-driven operations at the database level via triggers. ®
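For the curious, compare-and-set in general means "change this value only if it still holds what I expect." The sketch below shows the idea inside a single Python process, using a lock to make the check and the write one atomic step. The `Cell` class is a hypothetical name for this example, and the sketch says nothing about how Cassandra 2.0 coordinates the same operation across many machines.

```python
import threading


class Cell:
    """A single value that can be updated conditionally."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def compare_and_set(self, expected, new):
        """Set the value to `new` only if it currently equals `expected`.

        Returns True if the update was applied, False otherwise.
        """
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True

    def get(self):
        with self._lock:
            return self._value
```

Two clients racing to claim an empty `Cell` with `compare_and_set(None, name)` will see exactly one call succeed; the loser gets `False` and can decide what to do next.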