Hands on with MDX

Dimension and measures (yes, but can it handle cubic furlongs?)

Yes, yes, but where’s the CODE?

OK, here we go. This MDX query finds the total of apple sales in London in 2007:

SELECT
{ [Products].[Apples] } ON COLUMNS,
{ [Location].[London] } ON ROWS,
FROM [FruitMDX]
WHERE [Time].[2007]

The syntax is undeniably reminiscent of SQL, using SELECT, FROM and WHERE and, like an SQL query it returns an answer. However, unlike SQL, the main use for MDX is not querying (useful as it can be for that). In Fast Track to MDX a book about using MDX in SQL server 2000, of the 21 chapters there are but two on queries. The rest cover MDX expressions.

An MDX expression is a partial statement, a small, highly adaptable and powerful code section. The power comes from the language’s ability to 'understand' multi-dimensional data, and this is a radical difference between querying relational databases using SQL and multi-dimensional databases using MDX. In relational theory (and indeed practice) rows of data are inherently unordered. The concepts of the 'next' row or the 'previous' row are meaningless, and standard SQL has no functions for locating rows by position. In a multi-dimensional world, position is vital and MDX is well equipped to manipulate data with regard to its position.

For instance, MDX has a function called CurrentMember which identifies the cell in which you are currently interested. It acts dynamically so as your focus moves, CurrentMember moves too.

The green cell in the cube above could be referred to as:

(Location.Chicago, Products.Lemons, Time.[April])

(MDX syntax insists that member names containing numbers, spaces, special characters and keywords be in square brackets).

Now that the cell has your attention it could also be referred to as:

(Location.CurrentMember, Products.CurrentMember, Time.CurrentMember)

The huge advantage of a function that understands about position in a cube, particularly with respect to time, is that it makes analytical queries very easy to construct.

Take an apparently simple question like "How many pears have we sold since the beginning of the year?" Anyone who has played the SQL will know how painful this is to code because SQL knows nothing about position in a table. (Yes, I know we can use cursors, which are invaluable, but they are not part of standard SQL). An MDX expression to do this analysis comprises merely six words and some bracketry. Wow.

The function YTD stands for 'Year To Date'. It requires a member and returns a set of members that includes the member you gave it and all previous members at that level within the same year. If you give it the member April, it returns a set containing the values for that month and for Jan, Feb and March.

We can use the CurrentMember function to identify the member to pass to the YTD function:

YTD(Time.CurrentMember)

We want to add up the values in the set returned by YTD, so:

Sum(YTD(Time.CurrentMember))

does that, and lastly we identify the measure to use:

Sum(YTD(Time.CurrentMember), Measures.[Units Sold])

That’s the completed expression. (Oh alright, it's seven words but only because the measure has a space in its name). The code above is an MDX expression that creates a calculated measure: it generates new values from existing values in the data set. Calculated measures can be used for many purposes, like working out growth as a percentage or a monetary value or sales averages for comparison over time. They’re almost infinitely flexible.

But the really clever thing about our calculated measure is that it works everywhere in the Time dimension. If you're at the month level, say September 2007, it sums the Units Sold values for January to September 2007 inclusive, and if you’re at Q2-2006, it'll total Q1 and Q2 values in 2006. Using the CurrentMember function lets the calculated measure work dynamically, returning a result wherever you are in multi-dimensional space. Clever or what? ®

Sponsored: How to determine if cloud backup is right for your servers