PowerPivot: a new spin on understanding your business
The Catch 22 of business intelligence is that to understand your business, you have to deal with vendors who want to “understand your business”, better known as “see how much we can charge you”. You then have to hope that something you don’t know might be really valuable.
The most useful results are surprises not suited to the top-down approach many firms still use. A business intelligence representative once made my blood run cold by by advising a large venue to filter ticket data through the booking system to understand customers better.
Yes, that would work, and using existing systems would make it cheap and quick. But if it went wrong, the booking system might stop and refuse to work for an expensive length of time – painful when it accounts for 90 per cent of revenue.
Trawling through every record on the system to find interesting relationships can also reduce a customer-facing server to a useless crawl. That is why your IT department takes so long to “add a little report” to the system, and why the volume of data on it would swamp Excel – even modern versions with a million-row capacity.
PowerPivot is Microsoft’s push for self-service business intelligence, allowing moderately competent users to be let loose on business data without fear of real harm.
As the name suggests it extends the venerable PivotTable, which as any SQL developer will tell you groups rows that have an attribute in common. But as most Excel users and developers view any form of programming in much the same light as leprosy, you have to see this as a user tool, not something for a database pro who can code his own cubes.
What does it all mean?
As a free add-in to Excel, it seems aimed at the SME market, but as we know there are huge piles of data in large outfits that probably mean something if you could only find out what.
Although Excel 2007 eventually got past the 65K limit on rows, it is easy for sales data, for example, to sweep past the one million limit. And although the multi-threading in recent Excels makes VLookup go faster, it doesn’t make it all the way to good.
PowerPivot can go far beyond 100 times the million-row limit in Excel. It does a very good job of compressing and caching data, which means that volumes that would either crush Excel or make it unusably slow actually work at a decent speed.
Part of that is achieved by making PowerPivot its own environment hosted in Excel – not just a set of functions and wizards but with pretty much the normal Excel user interface acting as a window on large data volumes from disparate databases.
Joining data from multiple sources requires that you have some idea about keys – values that allow you to match up rows. But if you are using SQL Server, PowerPivot will copy down the relationships that are already there.
Microsoft clearly sees it as a breadcrumb trail to its server products
The important word here is “copy”; PowerPivot is a place you bring data to be analysed not a front-end to an external database.
In fact Microsoft's documentation is a bit shy about one advantage of basing business intelligence in Excel: it can suck data out of anything, which sometimes means that the only integration between large systems is user spreadsheets.
However, the data importing wizards resolutely point you at Access and SQL Server, betraying the commercial motivation in giving away a serious tool like this. Microsoft clearly sees it as a breadcrumb trail to its server products.
SQL Server using DTS and Report Services is a good multiway adapter to suck data out of diverse sources. PowerPivot’s wizards will dissect Report Services to allow you to take reports put together by central IT and alter them to cast the data your own way.
A good job has been done in not forcing the user to get bogged down in connection strings and database schema. If it can reach your desktop already, you can get it into PowerPivot without too much pain.
Of course, since you are probably doing this without the help of IT because they are too busy, it means that if they change the reports your analysis will go wrong in interesting new ways. To some extent you are trading speed of development against corporate grade bullet proofing, although end-users will probably make that trade happily.
The idea of self-service business intelligence is that you can look at the same data by different criteria, matching up data sets and aggregating the results, and since this is Excel, displaying them prettily with Sparklines (small graphs) as well as presentation graphics.
Although designed as an end-user tool, the environment has most of the attributes of a development system. It allows power users to put up Slicers that can be used to select subsets by floating filters above the data, and these filters are a lot more powerful than before.
The formulas in PowerPivot are not cell-oriented, as in core Excel, but based upon ranges that come out of relationships. So when you specify, for example, that profit in transactions is price minus cost, your new column is populated with the profit for each transaction, something that SQL users will find familiar.
This was not impossible in classical Excel, but it did require filling ranges with formulas and could get quite complex and slow.
The workflow in PowerPivot does feel quite one-way: you suck in data, join it together, slice and dice it, then produce a report application, but you have to think a little harder to drill back to data points that cause interesting or worrying results.
That being said, it does make some tricky report generation quite hard to get wrong, and since it can be mostly driven by clicking and dragging, it feels surprisingly productive.
PowerPivot is not as much of an improvement over Excel for analysing complex data as it could be. But overall it is engineered well enough to earn my highest commendation for a piece of software, which is that it has replaced some of the ways I used to do things. ®