Australian Bureau of Statistics drops big data bubble-buster
Mindless mash-ups and hardware upgrades do not better governance make
Big Data, the shiny happy story goes, will let governments direct resources into programs that really do make a difference to the problems society faces, resulting in better services, less waste and grins on every face.
But Australia's Bureau of Statistics (ABS) has just published a Research Paper, “A Statistical Framework for Analysing Big Data”, that says there are some tricky problems to overcome before Australia, and by extension other governments, can put Big Data to work.
Penned by Dr Siu-Ming Tam, the ABS's Chief Methodologist, the Research Paper identifies three “threshold challenges” that must be met before Big Data becomes truly useful. The first, Dr Tam says, is the business case. Statistics like births, deaths and marriages are, he argues, obviously useful and important. But Dr Tam says he has “heard of propositions such as 'let’s bring all the Big Data into our organisation and then figure out what we want to do with it. And to effectively do this, let’s upgrade our computer hardware, or software, because Big Data requires big data processing capabilities'”.
“These propositions worry me,” Tam writes, “as they put the cart (Big Data) before the horse (business problems) and treat 'Big Data as a solution in search of a problem'”.
The second threshold challenge is “Validity of Statistical Inference”, which, as the paper explains it, boils down to something akin to the problem of assuming correlation is the same as causality. Numerous diverse data sources, Tam says, may look like they're relevant, but the fact we can now mash up enormous data sets doesn't make them fit for purpose.
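For readers who like to see the correlation-versus-causality trap in action, here is a minimal Python sketch. It is not taken from Tam's paper: the names `hidden`, `series_a` and `series_b` are hypothetical, and the data is synthetic. Two series are each driven by the same unseen factor, so they correlate strongly when mashed together even though neither influences the other.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Hypothetical unseen factor that drives both data sets (an assumption
# made purely for illustration).
hidden = [rng.gauss(0, 1) for _ in range(n)]

# Two synthetic "found" series: each depends on the hidden factor plus its
# own independent noise, and neither depends on the other.
series_a = [h + rng.gauss(0, 0.5) for h in hidden]
series_b = [h + rng.gauss(0, 0.5) for h in hidden]

raw_r = pearson(series_a, series_b)

# Subtract the hidden factor and only the independent noise remains.
resid_a = [a - h for a, h in zip(series_a, hidden)]
resid_b = [b - h for b, h in zip(series_b, hidden)]
adjusted_r = pearson(resid_a, resid_b)

print(f"correlation of the raw series:       {raw_r:.2f}")
print(f"correlation with the driver removed: {adjusted_r:.2f}")
```

The raw series look tightly linked, yet the link vanishes once the shared driver is accounted for. In real found data that driver is usually unknown, which is roughly why piling up more rows does not on its own make an inference valid.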
Nor does the availability of data mean it is permissible for use by National Statistical Offices (NSOs). Tam worries about provenance, bias in “found” data and how NSOs can go about acquiring data if the result is products that compete with private analytics outfits or data providers.
Tam does have a go at putting big data to work, in a model using satellite images and other data to predict crop yields in Australia. The result is beyond your correspondent's mathematical capabilities, but Tam says it represents an attempt at using diverse data sources and a Big Data approach that the ABS hopes irons out some of the kinks. The paper's free to download, so there's nothing to stop other NSOs considering his approach too. ®