The myths and legends of the holy land of the database

The first myth explored and exploded

First Myth

"Oracle can't do MOLAP."

In the BI world, this is a serious charge. MOLAP (Multi-dimensional On-Line Analytical Processing) is the Wunderkind of the BI world. So, how does this myth stack up?

OLAP (On-Line Analytical Processing) is a way of organising data so it can be analysed very rapidly by users. It takes a base set of relational data (say, sales of individual products to particular customers, day by day) and calculates a set of aggregated values (eg sales of products to all customers in Essex for each month). The combination of the base and aggregated data is known as an OLAP cube.

The point of building an OLAP cube is that when users ask to see, say, the sales of fridges, month by month in a particular county, the answer comes back very rapidly. OLAP can be implemented in three ways:

• ROLAP
• MOLAP
• HOLAP

ROLAP is Relational OLAP and means you are storing the base data and the aggregations in relational tables and managing it using a relational database engine.

MOLAP stores both the base and aggregated data as a multi-dimensional array of data. This data must be managed by a specialised multi-dimensional database engine.

HOLAP is a hybrid between the two. The base data is held in relational tables and the aggregations in a multi-dimensional array.

Why three different ways to implement? Well, ROLAP means you can use the engine that you know and love. It is safe, secure and familiar. In addition, when OLAP started, there were no multi-dimensional database engines, so ROLAP was all we had.

MOLAP is a more modern approach. It usually generates aggregations faster and the cube it generates is generally faster to query. As a result, MOLAP is now very popular.

HOLAP can be useful where you have a massive set of base data and you find the users of the systems very often query just the aggregated data.

Now, it turns out that when Oracle is used to create an OLAP set of data it stores both the base and aggregated data in relational tables. It does not use a separate multi-dimensional array. Since we know that relational storage equals ROLAP, and we know that Oracle only does it this way, it follows that Oracle can't do MOLAP. Oracle is old fashioned, Oracle is dumb, Oracle is...

This is actually a very clever myth, one of the class where the facts as presented are all true but the inference is completely wrong. What the myth perpetrators leave out is an additional fact; namely that the Oracle database engine has, integrated within it, the code necessary to manipulate multi-dimensional arrays. That one fact enables Oracle to do something rather unusual.

So, what really is going on?

Well, the base data is, as always, a set of data stored in text and numeric fields in relational tables. The "normal" ROLAP process creates a set of aggregated values and stores them as numeric and text fields in a relational table. When Oracle creates an OLAP cube it generates the aggregations as a set of multi-dimensional arrays and stores these, not as a structure to disk, but in BLOB (Binary Large Object) fields within a relational table. The fact that Oracle creates multi-dimensional arrays as the aggregation store tells us that this is definitely MOLAP - the only oddity is where those arrays are stored.

Why does Oracle store the data in this way? Glad you asked [I rather assumed it was because Oracle wants to push Oracle licenses - Ed]. Oracle argues that in terms of performance it doesn't particularly matter where the multi-dimensional array is stored. The huge advantage of storing it in a relational table is that doing so brings the data inside the world that Oracle DBAs already understand. It is within the compass of the normal backup systems, the normal security. An Oracle DBA can maintain an OLAP database as easily as a normal one because the OLAP one IS a normal one - at least in terms of day-to-day maintenance. It is a multi-dimensional one in every other respect, including performance.

(In fact, we can now start a discussion about whether it is MOLAP or HOLAP but that discussion turns into a matter of degree - how much of the data is relational, how much multi-dimensional.) The point is that Oracle can handle true multi-dimensional data structures with perfect aplomb, the only unusual feature of Oracle's implementation is where it stores the multi-dimensional set.

So, that's one myth explored and, in this case, exploded. Oracle was the victim here, but don't feel too sorry for the company; it's big enough and robust enough to look after itself in the rough and tumble of the commercial world. Take, for example, its recent advertising campaign, which boasts the headline "Oracle Database 10g #1 on Windows". As Oracle clearly intended us to understand, the '#1' tag here refers to the version of Oracle shown (Standard Edition One). It is certainly not intended to imply that Oracle is the best selling database engine on the Windows platform.

So, happily, Larry Ellison doesn't have to say to his marketing people "That's another fine myth you've gotten me into".