The Register® — Biting the hand that feeds IT

Feeds

Hands on with MDX

Dimension and measures (yes, but can it handle cubic furlongs?)

Customer Success Testimonial: Recovery is Everything

Following our introduction to MDX (to be found here) this follow-up article is a get-you-started guide to using this powerful language to manipulate multi-dimensional data.

The basics

Relational databases store data in two-dimensional tables, a familiar concept that mimics grids of data on paper. Multi-dimensional data is fundamentally different and is stored as dimensions and measures. Dimensions describe the data: examples are Products (the items we sell), Location (where sales take place) and Time. Measures are usually numeric, like Units Sold, Sales and Budget.

We can see the principle of multi-dimensional data in a very simple two-dimensional structure. Here there are two descriptive dimensions – Products in the rows and Time in the columns. There is one numerical measure, Units Sold.

Units Sold

Jan Feb Mar Apr
Apples 125 45 78 169
Figs 74 53 89 39
Lemons 20 21 25 14
Pears 220 74 103 156

But, unlike tables in a relational database, we aren’t limited to two dimensions. Add a third – Location - and we get a 3-D cube of data.

As before the rows show Products and the columns show Time. The slices that give the cube depth show the additional Location dimension. The values held at the intersections of all three dimensions are the Units Sold. (For simplicity only four values are shown). The green cell shows the number of lemons sold in Chicago in April.

Dimensions contain members and here each dimension has four: Chicago, London, Paris and New York are the four members in the Location dimension.

A multi-dimensional data structure can have many more dimensions, and this is where any visualisation technique breaks down. We humans live in a 3-D world and imagining further dimensions is hard: however, the cube analogy is a good starting point for understanding the principle behind multi-dimensional structures. (Not entirely coincidentally, multi-dimensional OLAP structures are commonly called cubes).

Hierarchies, levels and aggregations

Dimensions are often hierarchical which enables them to hold data at different levels. A Time dimension is almost invariably hierarchical and typical levels might be years, quarters and months.

Hierarchies usually have an All level at the top to give a grand total of all data, with levels below. In this case the Year level contains members for 2006 and 2007, each of these is divided into four quarters and each quarter into three months. (Not all the months are shown.) This hierarchy lets us look at the Units Sold measure for each time period: all fig sales in a particular year, quarter or month.

During the cube-building process, data for the lowest level, called the leaf level (monthly values here but could be weekly, daily, whatever) is combined to give the higher values, quarters combine into years, and years into All. These calculated values are called aggregations.

SaaS data loss: The problem you didn’t know you had

Latest Comments

Re. YTD

For those who don't recognise the name, 'Mosha' is Mosha Pasumansky; without doubt the world's leading authority on the language. The book mentioned in the article (FastTrack to MDX) is one that Mosha and I wrote in conjunction with Robert Zare. I have learned over the years that Mosha is always right about MDX (and lots of other stuff as well) and, if you get into MDX, his blog at:

http://sqljunkies.com/WebLog/mosha/

is well worth studying.

He is, as always, quite correct that we can reduce this MDX down to two words which emphasizes still further the power of this language. So we see an analytical statement that in SQL would take serious effort being achieved with almost Zen-like ease in MDX. This is not to say that MDX is therefore better than SQL; the two have strengths in different areas; SQL for transactional querying, MDX for analytical. But it does emphasise that MDX is worth learning.

0
0

YTD

Hi Mark

You can improve the formula for computing year to date sales even more, if the Time dimension is marked as such. Then you don't need to specify Time.CurrentMember inside YTD, and the formula becomes

Sum(YTD(), Measures.[Units Sold])

Now, if we don't create this as calculated measure, but instead use utility dimension or attribute, which is the best practice for Time related calculations, this could be simplified even more:

Sum(YTD())

Two words instead of seven :)

Mosha.

0
0

Re. Interesting, if a little abrupt...

Hi David,

Well spotted! Sorry about that, there was a technical hitch and only about two thirds of the article appeared initially.

It is all there now; thanks for pointing out the error.

Mark

0
0

More from The Register

Nuke plants to rely on PDP-11 code UNTIL 2050!
Programmers and their walking sticks converge in Canada
Bjarne Again: Hallelujah for C++
Plus: Now officially OK to admit you never used STL algorithms
Interwebs taunt Sir Jony over Apple eye candy makeover
Hey Ive, Ive... add more unicorns, willya?
SCO vs. IBM battle resumes over ownership of Unix
Zombie lawsuit back and wants to suck the brains out of Linux
Red Hat to ditch MySQL for MariaDB in RHEL 7
So long, Oracle! Don't let the door hit you on the way out
Shy? Socially inadequate? Fiddling with your phone could help
App 'tells the brutal truth' about social inadequates' chatup lines
Java EE 7 melds HTML5 with enterprise apps
New release arrives with GlassFish, NetBeans support
 breaking news
'Office Facebook' firm Tibbr wants you to PAY for mobe-meetings app
Great idea. Punters won't cough for it though
 breaking news
PM Cameron calls for modern, programmable computers! (We think)
IT education musings to G8 chiefs to mystify IT industry
Apple at WWDC: Sleek new iOS, death of the big cats, pint-sized Mac Pro
CEO Cook: 'The biggest change to iOS since the introduction of the iPhone'