The blessing and the curse of Big Data

It's what infrastructure is really for

Sysadmin blog Companies more familiar with technology are more likely to use the reporting and analytics features of their software. This isn't something new, and it didn't start with computers. Computers make reporting and analytics easier, but every business needs hard data if it is to grow.

Back in the day, "reporting and analytics" were why the company had a bookkeeper or accountant. Perhaps the biggest reason for the microcomputer revolution of the early 80s was that these "personal" computers enabled complicated reports to be run against accounting data by anyone, at any time.

This was increasingly important as accounting became less about tracking money and making projections and more about regulatory compliance. Accountants were occupied with regulatory compliance concerns and didn't have time to generate reports for everyone who wanted them.

This created a feedback loop. Easy and fast access to financial data meant that companies could react on shorter time scales. Yearly planning sessions became quarterly. Quarterly reviews became monthly. Changes in sales patterns were detected earlier and earlier.

Reporting availability had real-world impacts on corporate footprints. Logistics and supply chain management evolved into just-in-time delivery, allowing fewer, smaller warehouses. As computers became more capable and datasets more complete, delivery route optimisation – among many other things – became possible.

The eye of the beholder

Numbers are just numbers. It's relatively easy to create a report that summarises sales or reports a list of balances owing. It's another thing altogether to present that information in a meaningful format that allows correct action to be taken.

As computers became more available, changes in sales patterns were detected earlier and earlier, and many companies reacted inappropriately. When you look at a yearly sales report, a lot of peaks and valleys are smoothed out. Change that to a monthly, weekly, or daily report and perfectly natural variations in sales that simply don't appear at larger scales can seem scary.
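The effect is easy to demonstrate. Here's a minimal sketch – the figures are entirely made up – of how the very same sales data looks noisy day-to-day but comparatively flat when rolled up into months:

```python
import random

random.seed(42)

# Hypothetical daily sales: a steady $10k/day business with natural noise.
daily = [10_000 + random.gauss(0, 2_000) for _ in range(360)]

# Roll the same data up into 30-day "months".
monthly = [sum(daily[i:i + 30]) for i in range(0, 360, 30)]

def swing(values):
    """Peak-to-trough variation as a fraction of the mean."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean

print(f"daily swing:   {swing(daily):.0%}")
print(f"monthly swing: {swing(monthly):.0%}")
```

The daily series lurches up and down; the monthly roll-up of exactly the same numbers is far flatter. A manager shown only the daily view could easily mistake ordinary noise for a trend worth reacting to.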

Today, interpreting various flavours of financial data represents numerous specialties all on their own. No ordinary line of business manager can be expected to look at the raw data and make rational decisions. Even executives need to be focused on the area of the company they manage; they are rarely expected to be statisticians.

Making sense of the raw data has to be done in software. This is as much because of the sheer volumes of data produced by most companies as it is because of the complexity of the information we wish to extract from that data.

Today, most financial software packages include robust reporting features. There are any number of built-in reports, as well as the ability to create custom reports fairly easily. Decades of documented business reporting requirements mean that the folks coding financial software have a good idea of what's needed.

The right analysis at the right time in front of the right person can pay for years of infrastructure in a matter of moments

The same isn't true for all datasets. The above example of route optimisation software is one where pre-canned packages exist, but there is still a lot of custom coding work being done. The more industry-specific the dataset, the more likely it is that, to really gain a competitive advantage, companies will need an in-house developer building reports and analytics.

The real value of IT

While tech magazines and nerds the world over obsess about infrastructure, it isn't infrastructure that makes (or saves) money. Infrastructure is an enabler, but the right analysis at the right time in front of the right person can pay for years of infrastructure in a matter of moments.

Let's consider a simple example. A company has an internal production IT system that is separate from its point-of-sale and shipping systems. When a customer's order for various widgets is produced, those widgets arrive in the bin of the shipper, along with a piece of paper that details what the items are and their barcodes.

The shipper uses the shipping IT system (provided by the courier they contract with) to determine the shipping costs. They then scan the information about the widgets from the piece of paper provided by production into the point-of-sale system, add the quantities and the cost of shipping, and print off another piece of paper.

The information from that piece of paper is entered into the shipping IT system and a label is produced. Everything is packaged and put onto the shelf for the courier to pick up. That's a lot of manual entry, and it occurred for each and every order for years.

The company in question hired an in-house developer who, on a whim, pulled the data from all three IT systems and compared them. He was building a separate package that would need to talk to all three systems, so this seemed a good way to learn the different formats used by each system to present data.

It was quickly discovered that a company with $10M yearly turnover was losing over $150K a year to human error involved in manually entering information where the three IT systems did not talk to each other. A month's worth of developer and sysadmin time – call it $15K worth of person-hours – and the three systems were communicating.

The shipper's daily workload was cut to a tenth of what it was, and $150K a year in human error simply vanished, because one computer system could feed the exact numbers and SKUs into the next.
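The comparison that developer stumbled into can be sketched in a few lines. This is a hypothetical illustration – the order IDs, SKUs, and field layout are invented – assuming each system can export its order lines as (order, SKU) → quantity records:

```python
# Hypothetical exports from the three systems: production, point of
# sale, and shipping. Order IDs, SKUs, and quantities are invented.
production = {("A100", "WIDGET-7"): 12, ("A101", "WIDGET-3"): 5}
point_of_sale = {("A100", "WIDGET-7"): 12, ("A101", "WIDGET-3"): 50}  # re-keyed: 50, not 5
shipping = {("A100", "WIDGET-7"): 12, ("A101", "WIDGET-3"): 50}

def mismatches(source, other, label):
    """Report order lines whose quantities disagree between two systems."""
    out = []
    for key, qty in source.items():
        other_qty = other.get(key)
        if other_qty != qty:
            out.append((label, key, qty, other_qty))
    return out

errors = (mismatches(production, point_of_sale, "prod vs POS")
          + mismatches(point_of_sale, shipping, "POS vs shipping"))
for label, (order, sku), a, b in errors:
    print(f"{label}: order {order} {sku}: {a} != {b}")
```

Run against real exports, every mismatch it flags is a candidate for exactly the kind of transcription error that was quietly costing this company $150K a year.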

This is the blessing and the curse of Big Data. Even if your datasets are small enough that you could run the relevant reports and analysis on a smartphone, it's not the hardware to run the reports that's the hard part. It's not even the software to do the analysis that's the hard part.

The hard part of IT is knowing what you want your IT to do, or what questions you want your data to answer. The value of IT isn't in the gear. It's in the experience of those who make use of it. It's knowing what questions to ask, and understanding the value that the answers can offer.

The future is data intensive

When there is talk about IT understanding the business, this is what's meant. In order to know enough to ask the right questions, one needs to understand the business pretty comprehensively. One also needs to understand IT well enough to know what's possible and what isn't, and how to frame discussions about data and its utility.

The Internet of Things is promising to commoditise sensors and instrumentation. A whole new world of datasets will quickly be available to us. At the same time, more and more of our IT infrastructure is becoming dynamic and programmable, usually with the moniker "software defined" something or other.

In an instrumented world we will move beyond human review of many things and more fully towards automation. Computers will react to inputs from other computers and changes will be made at speeds we mere humans can only barely understand.

Who will cross that chasm in your organisation? Will IT take the plunge and really learn about the business? Or will the business side start becoming tech-savvy, sidelining IT as mere implementers? Who will be the architect of your company's future, and if you don't have the skill internally, who do you turn to? Answers, as always, in the comments. ®



