Thanks for all this data, UK.gov, but what on Earth does it mean?
MPs want more than just large dumps of numbers
The coalition government needs to work harder if it's to convince the public that shovelling out spades of raw data will make it an open and transparent administration, MPs have said.
The Commons Public Accounts Committee has drawn up a report called Implementing the Transparency Agenda, which welcomed Number 10's Open Data Initiative, and said many of its objectives have been met.
The initiative aims to take facts and figures on what the government actually does - from local council duties to healthcare - and publish them in a consistent way that can be analysed and understood by humans and software. Rather than dumping PDFs or TIFF scans of numbers, for instance, it offers .csv and .xls files that can be downloaded and queried.
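To give a flavour of what "downloaded and queried" means in practice, here is a minimal Python sketch that filters a small spending file for payments above a threshold. The column names, suppliers and figures are invented for illustration and are not taken from any real data.gov.uk data set; a real file would need its own column names swapped in.

```python
import csv
import io

# Hypothetical sample data: column names and figures are invented for illustration.
SAMPLE_CSV = """supplier,department,amount_gbp
Acme Paving Ltd,Highways,31250.00
Brightside Catering,Education,1840.50
Civic IT Services,Corporate,98000.00
"""


def payments_over(csv_text, threshold):
    """Return the rows whose amount_gbp value is greater than threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if float(row["amount_gbp"]) > threshold]


# Pull out only the payments above 25,000 pounds and print them.
for row in payments_over(SAMPLE_CSV, 25000):
    print(row["supplier"], row["amount_gbp"])
```

The same few lines would be far harder to write against a PDF or a TIFF scan, where the numbers first have to be recovered from page layout or pixels.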
It's an attempt to encourage civic-minded programmers, as well as government departments, to plot the data in pretty apps for citizens curious about what their taxes are being spent on. The initiative sidesteps the need for people to write a bazillion tools to scrape information from a plethora of dull Whitehall minutes.
But MPs said the project faced challenges that can only be solved through "strong leadership" by the Cabinet Office, which is responsible for whipping government departments into taking part.
Those hurdles include ensuring interest in a data set is maintained long after its big launch into public view, to prevent the whole project from grinding to a halt.
The politicians pointed out key areas for improvement: these include cleaning up facts and figures before they are released - and improving presentation to make them easier to understand - rather than dumping "large quantities of data into the public domain". In terms of information presented well, the committee highlighted the initiative's top hits: a map of crime reports in the UK and school performance tables.
Ah – policy will protect our privacy... that's OK then
On the need to protect an individual's privacy in the rush to chuck data over the wall, MPs were placated by the government: "On the risk to personal privacy, the Cabinet Office assured us that it would set out policies and controls adequate to protect privacy in its white paper," they said.
It's not clear which white paper the committee was referring to, but the government did publish an Open Data white paper in June, which said a privacy expert would be appointed to the Data Transparency Board, established by the Prime Minister in 2010. Privacy experts would also be brought into all-sector panel discussions across Whitehall when data releases are being considered.
The government should also stamp out the defence of "commercial confidentiality" that is deployed by private companies as a way to stop the release of data, said the MPs.
And the administration must do some cost-benefit analysis, to be clear on what it costs taxpayers to release information and whether the data is of use.
The call for better cost-benefit accounting echoed a report from this year by the National Audit Office, which found wild variations in the cost of making data public, along with huge variations in what people actually found useful: those police crime stats are a big hit, but figures on spending by local authorities in excess of £25,000 are not.
Other areas singled out by the audit office included the completeness of data sets, a point the committee's report also picked up on.
The MPs welcomed the increase in information that's now public: the number of data sets available through data.gov.uk reached 7,865 by December - up from 2,500 in January 2010. The committee reckoned the government hit 23 of its 25 commitments to the scheme for the past year.
However, MPs said it wasn't good enough to dump large quantities of raw data into the public domain. "It must be accessible, relevant and easy for us all to understand," they concluded.
The information lobbed out to the public must be "fit for purpose", MPs added. Currently there are gaps in what's being opened up - such as price and performance information for adult social care - while other info is not being presented on a consistent basis - notably info on local government.
How much for one of those pretty charts, guv'nor?
On costs, MPs reckoned the cost to local authorities for publishing data on their spending ranged from "virtually zero to £100,000 per annum", while the police crime maps had set-up costs of £300,000 and annual running costs of more than £150,000.
"The government has not yet developed a full understanding of costs and benefits of making information transparent, and so decisions on what data to make available and in what form are not yet guided by value for money considerations," the report said.
According to the Cabinet Office, the newly founded Open Data Institute will drill into "the economic and public service benefits of open data".
The institute opened in May under the leadership of World Wide Web inventor and surprise 2012 Olympics opening ceremony star turn, Sir Tim "Greatest Living Briton" Berners-Lee, with artificial-intelligence professor Nigel Shadbolt. The aim of the institute, we're told, is to turn the numbers the government releases into something tangible, through taxpayer-funded training and startups.
Also, the government doesn't know whether or not to charge for the data it publishes because it hasn't done enough research into the subject.
According to MPs, the Met Office, Ordnance Survey, Land Registry and Companies House operate as "trading funds", and depend on selling their data to raise a sizeable chunk of their annual revenue. But studies suggest there could be "considerable economic benefits from making that data available for free".
"The government … has not developed a strategic approach, and has a convoluted proposal to purchase some public data from its own trading funds and other parts of the public sector, and then make the data freely available to others," MPs said. The Cabinet Office should work with the Department for Business, Innovation and Skills to decide whether it makes more economic sense to give data away or to charge for it.
One area where the government will struggle is in forcing the private businesses that dip into the public purse to be more open - and to not hide facts and figures behind the excuse of commercial sensitivity. On this, MPs were clear: "We must be able to follow a taxpayer's pound wherever it is spent." ®
They're not serious about releasing info
When I asked under FoI for a list of websites visited by Theresa May and senders/recipients of all her emails I got the brush-off. Odd really, 'cos she wants to know all that about us.
I prefer mine tartare
If any government repackages the data in a way which makes it easier to understand, it's inevitable that they would also put their greasy spin on it. We already get their version via press releases and departmental PR - surely the point here is to get the raw data available too, so that anyone can put their own visualisation / interpretation / spin on it. Or at least, anyone who can get their head around the semantic web can.
And while hating Nu Labour and the coalition with equal venom, one has to give credit to both for running with the idea, and letting Tim Berners-Lee drive the agenda - though as anyone on the mailing lists will know, all the actual good stuff has been done by the civil service, not by politicians.
Pint, because I'm taking tomorrow off so today is technically Friday.
Re: I prefer mine tartare
I agree that we should still get dumps of the raw data, but they must be in an open, machine-readable format. I don't care whether I can read a table of numbers when there are 50 billion of them; I care that I can tell a computer to pull out the bits I want or draw a graph for me. Scans are bad, .csv is OK.
Hopefully this will all get better over time, as new records are made with the idea that someone will read them again, unlike the old records that were only kept because the policy said they should be, so nobody needed to worry about being able to read them.