Big, distributed, and fast: Ehcache sucks up search
Java for the NoSQL generation
Open sourcers running the Ehcache distributed Java cache can now search their data in near real-time by harnessing lashed-together servers.
Terracotta, which bought the Ehcache project in 2009, has released Ehcache 2.4. It features an API extension that lets you perform object-level queries of data held in memory. The API is backwards compatible, so should work on older versions of Ehcache.
According to Terracotta, this lets you avoid performance bottlenecks encountered when crunching large amounts of data on a single server. Also, in using general purpose caching with your existing servers, you can sidestep expensive hardware appliances that funnel data through hundreds of cores, terabytes of memory, and Infiniband.
Terracotta said its architecture targets people crunching terabytes of data, rather than petabytes, and it claims searches of 48 seconds can now be executed in just half a second. Because Ehcache is built Java, you can also build search queries using Java rather than build search queries using a different query language.
Terracotta said it's in talks with business-intelligence tools vendors to plug their tools into the API to enable more sophisticated slicing and dicing of data.
The Ehcache project is used in about 70 per cent of Java caching. The ability to query data held in memory using the system comes as big-data providers look for ways help customers make sense of the information quickly amassing in their big-data silos.
Customers have been deploying a host of NoSQL architectures to catch data because NoSQL is seen as faster and more scalable than SQL databases in large server farms and on large web sites.
But inevitably, people now want to query the information gathered - data like searches, personal updates, and Tweets - but the search tools have been lacking.
Last month, open-source BI vendor Jaspersoft announced its Native Reporting Big Data project to build connectors that can natively query data in NoSQL databases and other stores.
Jaspersoft's project currently offers connectors for NoSQL databases Cassandra, CouchDB, MongoDB, Riak, and Neo4j; the Hadoop and Infinispan data crunching frameworks; key-value store Redis; and massively parallel processing (MMP) analytic database Vertica bought by Hewlett Packard this week.
Terracotta says Ehcache 2.4 is different from the NoSQL stable becuase it offers "tried and tested" enterprise Java that fits into existing Java architectures with "strongly consistent" data across different nodes. "We see ourselves as a bridge between the traditional and the new," Terracotta chief executive Amit Pandey said. ®