Self Service BI: Would you, should you?

Original URL: https://www.theregister.com/2011/10/17/bi_self_service/

Reg Readers say Yes!

Posted in Software, 17th October 2011 11:11 GMT

Expert Clinic: We asked last week for your expert views on the state of self-service business intelligence.

Let us set the scene: Analysts at Gartner (and Forrester, and elsewhere) have predicted that 2011 will be the year of self service business intelligence, as users demand tools to help them access and understand company data.

The users’ motives are easy to understand: identifying and responding quickly to trends and patterns in business data is vital, especially when the margins are narrow. The faster the reports can be produced, the better, and the lighter the load on IT.

But is this really happening? Now that spreadsheets can handle increasingly large datasets, can a user really do BI from the desktop?

We asked for your input and we got it. Overwhelmingly, the answer was yes, desktop BI can be done, and indeed has been done for many years. Sarcastic remarks about the oxymoronic nature of the subject aside, concerns were raised about data quality and the wisdom of allowing users too much computational rope with which they might do themselves harm.

You didn’t all agree, though. Some made the observation that even when BI is on the desktop, the reports tend to be generated by a single “expert”, rather than being something everyone gets involved with. BI is still a specialist function, even if the specialist isn’t part of the IT department, per the pseudonymous Kubla Cant.

On to our chosen answer. We thought Colin McKinnon’s view of the issue was worth repeating, particularly because of his observation that fully fledged desktop BI blurs the line between users and IT. Where, we wonder, is that line most comfortably drawn?

Colin McKinnon, Reg Reader, IT support in the financial sector by day, expert panellist by night, argues that nothing is quite that simple:

Of course they can "really do BI from the desktop"; the problem is a bit more complicated.

Making tools easier to use and more integrated does not resolve the fundamental problem: in order to produce accurate information from data you need to understand the structure of the data and the effects of transformations you apply.

While we often think of relational, navigational databases, and even NoSQL datastores as essentially multi-dimensional datasets, this in itself is a simplification. All user-oriented BI systems I've had experience of stop at the multi-dimensional model. While this might be excusable when looking at high level data, it often limits the extent to which a user can drill into data and get meaningful results.

But even at a high level, there are problems. Sometimes data lacks an obvious and intrinsic hierarchy - to accommodate this, arbitrary mappings of items to nodes in a hierarchy are created. Suddenly the meaning of the information is dependent on a structure not directly derived from the data itself.
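As an illustration (not part of McKinnon's answer), here is a minimal Python sketch of how two arbitrary mappings roll the same figures up differently. The items, sales figures and both hierarchies are invented for the example.

```python
from collections import defaultdict

# Hypothetical unit sales; the items have no intrinsic hierarchy.
SALES = {"umbrella": 40, "sunscreen": 25, "raincoat": 35}

# Two invented, equally defensible mappings of items to hierarchy nodes.
BY_SEASON = {"umbrella": "Winter", "raincoat": "Winter", "sunscreen": "Summer"}
BY_DEPARTMENT = {
    "umbrella": "Accessories",
    "sunscreen": "Accessories",
    "raincoat": "Clothing",
}


def roll_up(sales, mapping):
    """Sum sales per hierarchy node according to the chosen mapping."""
    totals = defaultdict(int)
    for item, amount in sales.items():
        totals[mapping[item]] += amount
    return dict(totals)


season_view = roll_up(SALES, BY_SEASON)
department_view = roll_up(SALES, BY_DEPARTMENT)
print(season_view)
print(department_view)
```

Both views add up to the same grand total, but what each subtotal means depends entirely on a mapping that someone chose rather than on anything in the data.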

Putting the user in control of such mappings and of specific transformations of data opens the door to differing interpretations/implementations; is it better to have 10 different methods for calculating the profit on a sale or just one?

While it may seem preferable to have a single method for calculating profit at the level of units, outlets, departments and corporate level, how easy is it to quantify that for each product/service you sell? Shipping to your Alaska outlet may cost significantly more than to the shop down the road. OTOH shipping costs will also vary by the size and weight of the product. It's often much simpler to apply an averaged metric - as a result different orientations of the same dataset relying on different reference points give different values.
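A rough Python sketch of the averaging effect, using made-up numbers: the outlets, prices and shipping costs are hypothetical. It calculates profit per outlet once with the actual shipping cost of each sale and once with a company-wide average.

```python
from collections import defaultdict

# Hypothetical sales: (outlet, revenue, cost_of_goods, actual_shipping)
SALES = [
    ("Alaska", 100.0, 60.0, 30.0),
    ("Alaska", 100.0, 60.0, 30.0),
    ("Local", 100.0, 60.0, 5.0),
    ("Local", 100.0, 60.0, 5.0),
    ("Local", 100.0, 60.0, 5.0),
]


def profit_by_outlet(sales, use_average_shipping):
    """Total profit per outlet, with actual or averaged shipping cost."""
    avg_shipping = sum(sale[3] for sale in sales) / len(sales)
    totals = defaultdict(float)
    for outlet, revenue, cogs, shipping in sales:
        cost = avg_shipping if use_average_shipping else shipping
        totals[outlet] += revenue - cogs - cost
    return dict(totals)


actual = profit_by_outlet(SALES, use_average_shipping=False)
averaged = profit_by_outlet(SALES, use_average_shipping=True)
print(actual)
print(averaged)
```

With these figures the corporate total is the same under both methods, but the Alaska outlet looks more than twice as profitable under the averaged metric. Two users who pick different methods will bring different numbers for the same outlet.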

Even when looking at a single data item, there can be ambiguities about what it means. The value of the order was 17,532.00, but what currency? How / when do/did you apply currency conversion rates? Did this include sales tax/VAT? Discounts?
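One way to remove some of that ambiguity is to carry the context along with the number. The sketch below is an illustration only: the `OrderValue` type, its fields and the 20 per cent tax rate are assumptions, and it covers only currency and tax, not conversion rates or discounts.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class OrderValue:
    """An order amount that states its own currency and tax treatment."""

    amount: Decimal
    currency: str        # ISO 4217 code, e.g. "GBP"
    includes_tax: bool   # True if the amount is gross of sales tax/VAT
    tax_rate: Decimal    # hypothetical rate, e.g. Decimal("0.20")

    def net(self) -> Decimal:
        """Return the amount excluding tax, rounded to two places."""
        if self.includes_tax:
            return (self.amount / (1 + self.tax_rate)).quantize(Decimal("0.01"))
        return self.amount


# The same bare figure, read two different ways.
read_as_gross = OrderValue(Decimal("17532.00"), "GBP", True, Decimal("0.20"))
read_as_net = OrderValue(Decimal("17532.00"), "GBP", False, Decimal("0.20"))
print(read_as_gross.net(), read_as_net.net())
```

The two readings of 17,532.00 differ by almost 3,000 in net value, which is the kind of gap that goes unnoticed when a spreadsheet cell holds only the number.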

Most of these issues also arise within software development, but there they are expected: interpretations of data should be standardised and documented, and the transformations of data tested to ensure accuracy.

Users have had access to spreadsheets for a long time - and in my experience, the quality of the tools they produce using such spreadsheets varies greatly. I've seen millions of pounds lost by a business due to a single bug in a spreadsheet application created by a user (tool was never tested, never documented).

Actually, I think BI on the desktop is a great idea - after all, the further we can keep some users from production systems the better. Once again, from personal experience, I've seen people without extensive IT training bringing a production system to a halt by running badly behaved or inefficient applications on transactional systems.

So we can give the users tools which are easy to use, we can train them in data structures and development processes, train them how to test their applications, provide them with version control systems and document management systems... at what point do they cease to be 'users' and become 'developers'?

Dale Vile, CEO/MD, Freeform Dynamics says business intelligence began on the desktop, and is still there today:

Users have been doing BI on their desktops since the spreadsheet was invented many years ago. The problem is that the practice of extracting and downloading data from core systems then processing it offline using Excel or some other desktop tool has created as many problems as it has solved.

Whether you survey IT professionals or ask business managers, the challenges that arise when desktop tools are used to work around information and access issues are all too obvious. Lack of attention to BI-related needs results in wasted time for both users and those in IT who need to support and bail them out when things go wrong. More importantly from a business perspective, the inaccuracies and inconsistencies that arise when users do their own thing in an uncoordinated manner mean decisions are not always being made on a sound basis.

With this in mind, the real question we should be asking is what’s required to allow users to do BI safely, efficiently and effectively.

We don’t have space here to go into some of the business level stuff such as KPI modelling or information governance. Suffice it to say that you need to have some idea of what users want or need to be able to do, and map this onto the various information sources that will allow them to do it. Armed with such an understanding you can then take steps to make it easier for business people to get to the right information in the right way. Things like data warehousing and/or software layers that aggregate or federate data on the fly are relevant here. The aim is to present the user with a view of information resources that is easy to navigate and is meaningful in business terms.

Whatever you use, this kind of ‘access layer’ is the key to avoiding a lot of the issues we have mentioned. There will be less reinventing of the wheel and more consistency, so when two people show up for a meeting, there will be a better chance of their numbers actually matching or complementing rather than conflicting with each other.

Another way of improving things is to allow users to share information easily before that proverbial meeting. Setting up an effective collaboration environment where files, links, and so on can be centrally stored and accessed, and even worked on collectively by multiple users, will minimise disjoints and promote better analysis and decision-making. Insights are enhanced when people compare notes and ideas in a BI context.

And coming back to the desktop, while there’s now some pretty powerful software and hardware available to users, it helps if you can better facilitate connected rather than disconnected working.

One of the tricks here is to exploit the capability of modern analytical tools, even office tools such as Excel, to work on an extract of data in local memory or storage, but flush the results of any manipulation and analysis back up to the network. Whether this is achieved through ‘live’ links to applications running in the data centre or cloud, or by checking files into and out of a document store on SharePoint or similar, the idea is to encourage BI activity to take place in a coordinated manner.

These are just some ideas for how to work more effectively in an increasingly important and challenging area. The important message is that with a joined-up approach, desktop BI becomes less of a business liability and more of an enabler of genuine business advantage.

Andrew Fryer, IT Pro Evangelist and BI Guru at Microsoft, says yes to BI on the desktop as spreadsheets become increasingly powerful, but warns IT to keep an eye on proceedings, lest the dreaded silos of data should emerge:

The simple answer to this is an emphatic yes, and this has been the case since the dawn of the spreadsheet. However, capabilities were really limited back then, not least because we could only manipulate 9,999 rows of data. More seriously, the resultant work was a siloed snapshot, and without the author you had no idea where the source data came from or how the data had been manipulated to arrive at the final analysis. This situation got worse and worse and eventually led to some serious regulatory issues resulting in the Sarbanes-Oxley (SOX) legislation in the US.

Doing some ad hoc analysis on the desktop does still have its value, but the real power kicks in when this work can be shared with others by publishing it somewhere in a form that can be easily consumed by less experienced users. In fact end-user BI like this is really powerful and done properly gives teams and departments the agility they need to do their own analysis with little direct involvement from IT.

Another thing to note about early spreadsheet use was that we used to (well I had to!) key in the data from other sources as there was no link to the source system. ODBC and now initiatives like OData have changed that, but what hasn’t changed is the need for good quality data. That might come from an internal source or from a web service, for example to track social media trends. The desktop tools, specifically spreadsheets and spreadsheet add-ins, then allow this data to be mashed together as a precursor to analysing it.

There is still a use case for strategic BI produced by the IT department (with end user involvement), to support the strategic direction of the enterprise, but this monolithic approach can’t keep up with end user demand for new sources of data, and new metrics to analyse it. Desktop BI will provide the personal/team/departmental BI needed to understand how tactical decisions are supporting the overall strategy. In some cases the work done could be strategic, and in this case the ad hoc work can be scaled up by IT and centrally published. The key to this is that the modern spreadsheet tools will record where the data has come from, allowing it to be refreshed automatically on a scheduled basis if need be. Thus the local spreadsheet becomes a workbench where all the key BI tasks (data extraction, analysis and publishing) are carried out, albeit at a small scale.

There has been much discussion about the post desktop/laptop era as smart devices are adopted by many CxOs, and while these are great for consuming BI, the actual analysis and publishing side of BI still needs a desktop or laptop.

For example, there are a lot of in-memory tools that allow local caches of data containing hundreds of millions of rows to be manipulated. The only way I can see this becoming obsolete is if all of your data was on an analytical platform running on some sort of cloud service. Today the only way I can see that working is via a remote desktop service where your virtual desktop is running on a server somewhere.

So desktop BI done properly is a valuable part of the overall BI strategy in your business, but if you avoid doing this formally it will happen on an ad hoc, unplanned basis and you’ll see siloed spreadsheets and end user databases (like Access) popping up on your infrastructure.