Feeds

DataStax adds search to Cassandra NoSQL

Solr powered distributed database

Build a business case: developing custom apps

Structure Data 2012 At the Structure Data 2012 conference in New York this week, DataStax, which as commercialized the Apache Cassandra NoSQL database originally created by Facebook and open sourced as an Apache project, has bolted on search to the data store and a plug in that lets it also search and index application logs.

The new search functions come thanks to the welding of the open source Solr search engine, which like Cassandra is written in Java. Solr is a variant of the Lucene search engine and is an Apache project as well. It adds REST and JSON APIs to the Lucene search engine.

DataStax chose Solr as its search engine to run atop the Cassandra distributed database inside of its DataStax Enterprise 2.0 release in part because it is somewhere on the order of five to six times more popular than the Apache Hadoop batch-oriented MapReduce data muncher.

Sometimes, rather than crunching data to do filtering and correlations, you just want to search it, and Solr is fast and now can ride atop Cassandra. Companies that already like Lucene and want something that is faster and that now has no single point of failure, thanks to the replication and clustering inherent in Cassandra, can now go with DSE 2.0.

In addition to Solr, DSE 2.0 has rifled around the other Apache projects and snapped in Log4j, a logging services layer, into Cassandra as well. The application logs, which help programmers debug their code, are now stored in Cassandra and are fully indexable and searchable.

The updated release also includes what DataStax calls elastic workload partitioning, which allows for x86 server nodes to be stood up running either real-time Cassandra apps or batch-oriented Hadoop MapReduce jobs. As the workloads go through their peaks and valleys, you dial up Cassandra nodes and dial back Hadoop nodes, usually during the day, and then reverse the process at night when you are in batch mode sifting through all of the data you collected during the day for correlations and associations.

Finally, the DSE 2.0 release comes with Sqoop, another Apache project (this one is in incubation phase), which is used to import data from corporate relational databases into HBase, a quasi-relational data store that runs atop the Hadoop Distributed File System, or the Hive data-warehousing system that also runs atop HDFS, and that allows SQL-like ad-hoc querying of information inside HDFS. Now, Sqoop can also speak Cassandra and suck data from relational databases to this alternative, distributed NoSQL database.

"This is a bit of a developer's paradise," DataStax CEO Billy Bosworth tells El Reg. "You have real-time, batch analytics, search, and logs all in one place. Now developers can focus on building applications and not worry about the back end."

The original DSE 1.0 launched last fall took the Cassandra datastore and added in the Hadoop batch analytics on top of it – MapReduce, Hive, Pig, and so forth – eliminating the master-slave node bottleneck and single point of failure of the kosher Hadoop-HDFS combination without breaking API compatibility with Hadoop. For all Hadoop knows, it is running on HDFS when it is running atop Cassandra.

The DSE distribution also includes a number of closed source elements, such as features that guarantee workload isolation between Hadoop algorithms and Cassandra code, the OpsCenter visual management tool, and other analytics features – but you can only get these by buying a license and paying for support for the DSE variant. However, if you want to just use the Apache Cassandra database, DataStax is happy to sell you a support contract for that.

DSE 2.0 is available now, and while DataStax does not provide list pricing, Bosworth said that it was on the order of a couple of thousand dollars per server node. "It tends to be an order of magnitude less than a relational database," says Bosworth. ®

Maximizing your infrastructure through virtualization

More from The Register

next story
HIDDEN packet sniffer spy tech in MILLIONS of iPhones, iPads – expert
Don't panic though – Apple's backdoor is not wide open to all, guru tells us
Do YOU work at Microsoft? Um. Are you SURE about that?
Nokia and marketing types first to get the bullet, says report
Microsoft takes on Chromebook with low-cost Windows laptops
Redmond's chief salesman: We're taking 'hard' decisions
Cheer up, Nokia fans. It can start making mobes again in 18 months
The real winner of the Nokia sale is *drumroll* ... Nokia
EU dons gloves, pokes Google's deals with Android mobe makers
El Reg cops a squint at investigatory letters
Chrome browser has been DRAINING PC batteries for YEARS
Google is only now fixing ancient, energy-sapping bug
Big Blue Apple: IBM to sell iPads, iPhones to enterprises
iOS/2 gear loaded with apps for big biz ... uh oh BlackBerry
prev story

Whitepapers

Seven Steps to Software Security
Seven practical steps you can begin to take today to secure your applications and prevent the damages a successful cyber-attack can cause.
Consolidation: The Foundation for IT Business Transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.
Designing a Defense for Mobile Applications
Learn about the various considerations for defending mobile applications - from the application architecture itself to the myriad testing technologies.
Build a business case: developing custom apps
Learn how to maximize the value of custom applications by accelerating and simplifying their development.
Consolidation: the foundation for IT and business transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.