Ellison munches unstructured data with Endeca buy

A massive Oracle big data/e-commerce/analytics mashup

Only weeks after announcing that it is going to create its own Hadoop distribution running atop its own Berkeley DB NoSQL database, Oracle has snapped up Endeca Technologies. Endeca has cooked up a data store called the MDEX Engine, plus some analytics and e-commerce front ends to it, and Ellison & Co. want to weave the lot into their own cohesive big data-commerce suite.

Endeca, which is apparently a bastardization of the German verb entdecken ("to discover"), was founded in 1999 at the height of the dot-com boom. The company was one of the innovators in the faceted search engines that are common on retail sites, which let you pick a subset of an online catalog by product type or brand and then search within it.
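To make the faceted idea concrete, here is a minimal Python sketch of picking a catalog subset by facet and then searching within it. It is not Endeca's implementation; the catalog, its field names, and the helper functions refine, facet_counts, and search are all made up for illustration.

```python
from collections import Counter

# Hypothetical catalog; the field names are illustrative assumptions.
CATALOG = [
    {"name": "Trail Runner", "type": "shoes", "brand": "Acme"},
    {"name": "Road Racer", "type": "shoes", "brand": "Bolt"},
    {"name": "Rain Shell", "type": "jackets", "brand": "Acme"},
    {"name": "Trail Jacket", "type": "jackets", "brand": "Bolt"},
]


def refine(items, **facets):
    """Keep only items whose fields match every selected facet value."""
    return [it for it in items
            if all(it.get(field) == value for field, value in facets.items())]


def facet_counts(items, field):
    """Count how many of the remaining items fall under each facet value."""
    return Counter(it[field] for it in items)


def search(items, text):
    """Naive case-insensitive keyword match on the product name."""
    needle = text.lower()
    return [it for it in items if needle in it["name"].lower()]


# Pick the "shoes" subset, see which brands remain, then search within it.
shoes = refine(CATALOG, type="shoes")
brands_left = facet_counts(shoes, "brand")
hits = [it["name"] for it in search(shoes, "trail")]
print(brands_left, hits)
```

The counts per facet value are what a retail site would show next to each brand or product type as the shopper narrows things down.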

The Endeca toolset is a lot more sophisticated these days, and as you can see from this presentation that Oracle put out as part of the acquisition announcement, the company has a number of ways it will be integrating Endeca's wares into the Oracle stack.

Here's the gist of it.

The Oracle 11g database is where you put your structured operational data, and the MDEX Engine is where you plunk your operational semi-structured or unstructured data. So all that talking that Larry Ellison did only three weeks ago about how "we really don't want to have two separate databases", one for structured and the other for unstructured data, well, er, not so much.

The way it is going to work is this. You want to chew on big data, you use Hadoop and Berkeley DB. The output of that gets dumped into the MDEX Engine data store, which sits alongside it – perhaps even in the same Exadata cluster.

The MDEX Engine data store is a columnar database instead of having the row-based orientation of the Oracle database – and most other relational databases, for that matter.
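The difference between the two orientations can be sketched in a few lines of Python. This is a toy illustration with invented sample data and a hypothetical to_columns helper; it says nothing about how either Oracle or Endeca actually lays data out on disk.

```python
# Row orientation: each record is stored together as one tuple.
FIELDS = ("name", "type", "price")
rows = [
    ("Trail Runner", "shoes", 89.0),
    ("Road Racer", "shoes", 120.0),
    ("Rain Shell", "jackets", 150.0),
]


def to_columns(records, fields):
    """Pivot row tuples into one list per field (column orientation)."""
    return {field: [rec[i] for rec in records] for i, field in enumerate(fields)}


columns = to_columns(rows, FIELDS)

# Row store: every whole record is visited just to read one field.
row_total = sum(rec[2] for rec in rows)

# Column store: only the "price" list is touched.
col_total = sum(columns["price"])

print(row_total, col_total)
```

Both totals are the same, but the columnar scan reads only the one field it needs, which is why analytics workloads tend to favor that layout.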

Oracle is using a hybrid columnar compression technique in the Exadata storage servers underlying the Exadata platform, so this is a bit of a mashup there, too. And Teradata has just added columnar support to its data warehousing database.

The MDEX Engine doesn't have a set schema, but rather one that changes on the fly (that's the faceted part), and which also has some in-memory attributes like the TimesTen database that Oracle just put at the heart of its Exalytics BI appliance.

The MDEX Engine is a bit funky in that it takes a column of data and stores it partially in memory and partially on disk, and sorts it two ways, one by value and one by key. A tree-structured index is cached in memory to zip through those columns looking for data.
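As a rough sketch of what sorting a column two ways buys you, the Python below keeps one copy ordered by record key and one ordered by value. The class DualSortedColumn and its methods are hypothetical, and binary search over sorted lists stands in for the tree-structured index; the memory/disk split is not modeled at all.

```python
import bisect


class DualSortedColumn:
    """One column kept in two sort orders (an illustrative assumption)."""

    def __init__(self, pairs):
        pairs = list(pairs)  # (record_key, value) pairs
        self.by_key = sorted(pairs)
        self.by_value = sorted((value, key) for key, value in pairs)
        # Parallel lists so bisect can search on a single field.
        self._keys = [key for key, _ in self.by_key]
        self._values = [value for value, _ in self.by_value]

    def value_of(self, key):
        """Look up the value for one record key, or None if absent."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self.by_key[i][1]
        return None

    def keys_with(self, value):
        """Return every record key whose column value equals `value`."""
        lo = bisect.bisect_left(self._values, value)
        hi = bisect.bisect_right(self._values, value)
        return [key for _, key in self.by_value[lo:hi]]


brand = DualSortedColumn([(3, "Acme"), (1, "Bolt"), (2, "Acme")])
print(brand.value_of(2), brand.keys_with("Acme"))
```

The by-key copy answers "what is this record's value?", while the by-value copy answers "which records have this value?", which is the question a facet filter asks.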

If you look at the datasheet for the MDEX Engine, you might think that you wouldn't need the Oracle database at all. You can pump in data extracted from your production ERP applications, content management systems, and application files, as well as clickstreams and social media data from Twitter and Facebook.

The MDEX Engine runs on 64-bit Windows or Linux platforms, and presumably will be ported to Solaris now that Oracle owns it.

That's not where the mashing up ends, however.

Endeca has a set of applications that ride on top of the MDEX Engine. One is called InFront, and it is used to customize the "customer experience" on retail Web sites, delivering targeted and relevant data on Web pages as customers browse and buy. This is done by paying attention to who you are and what you do.

Another tool that uses information stored in that columnar database is called Latitude, and it's a more traditional BI analytics tool. This will be combined with the Oracle BI suite, which is based on relational OLAP and multi-dimensional OLAP databases, so Oracle can do analytics on unstructured or semi-structured data.

MDEX will also be put side-by-side with Oracle Content Server to give better search and faceted navigation capabilities to Oracle-based Web sites, and Oracle also plans to weave together its ATG Commerce e-commerce software with Endeca's InFront, the latter of which will bring guided navigation to this retailing front-end.

Oracle did not announce the terms of the acquisition for Endeca, but the company has raised $65m in venture capital in four rounds, according to CrunchBase, so presumably Ellison paid a reasonable amount of dough to get his hands on MDEX, InFront, and Latitude before rivals SAP, HP, or IBM did. Endeca has over 600 customers worldwide.

Oracle expects the deal to close before the end of 2011. ®
