Surveying anonymity and the public good
Our survey shows researchers and the public disagree
Comment
Members of the public are wary of having their data used – even anonymously – for research purposes, whilst researchers are altogether more laid back about the proposition.
That is one key conclusion of a Department of Health consultation on Additional Uses of Patient Data (pdf), published on 1 December, which found that "about half of the general public (53%) and patients (46%) thought that identifiable data should never be used without consent while only about one in ten researchers (11%) thought this".
It further reported: "More than half of the researchers (54%) thought that patient identifiable data should sometimes be used without patient consent as long as there was review by a group such as PIAG. Lower proportions of patients (30%) and the general public (30%) agreed".
This difference may reflect little more than a long-running debate between researchers and researched. However, it may also illustrate a number of issues that are likely to figure in debate around public policy and data over the next twelve months, including the over-enthusiasm of the well-intentioned, and a growing rift between Labour and Tories on how central government should use data.
Over the last few months, we have investigated instances where individuals’ data has been collected for research without clear explanation of the uses to which it would be put, or even any positive effort to obtain permission from the data subjects. This happened in Lincolnshire, when the Local Community Health Services (LCHS) requested intimate details of children’s behaviour and wellbeing from parents.
Public outcry followed: but when last we spoke with LCHS, little had changed. They justified the survey as it supported programmes designed to do good: they showed little empathy for individuals who might just not wish to have their data collected.
They also run a relatively intimate "Lifestyle Behaviour Review" (pdf) of year 8 pupils. Whilst the LCHS claim this is "anonymous", they have dragged their heels in response to a question about whether their survey is genuinely "anonymous" in terms of the Data Protection Act. As Data Controllers are aware, the scope of the DPA is in practice synonymous with "identifiable" data: if data items can be combined in such a way that a specific individual is identifiable, the law applies even when no name or address is present.
The Office for National Statistics are acutely aware of this issue when publishing small area statistics. To prevent individuals being identified, they use "record swapping", where "a small sample of records are swapped with a similar record in another geographical area". They also require that "the average cell size must be greater than or equal to one" and adjust small counts appearing in any table cell.
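To make the idea of record swapping concrete, here is a toy sketch in Python. It is not the ONS's actual procedure: the field names (`area`, `age_band`, `sex`), the `rate` parameter and the matching rule (identical values on the chosen fields) are all assumptions made for illustration.

```python
import random
from collections import Counter

def swap_records(records, key_fields, area_field="area", rate=0.1, seed=0):
    """Swap the area code between a record and a 'similar' record elsewhere.

    'Similar' here means identical values on key_fields (an assumption).
    rate is the number of swaps to attempt, as a fraction of the record count.
    """
    rng = random.Random(seed)
    out = [dict(r) for r in records]   # copy so the input is left untouched
    target = int(len(out) * rate)      # how many swaps to attempt
    order = list(range(len(out)))
    rng.shuffle(order)
    used = set()                       # records that have already been swapped
    done = 0
    for i in order:
        if done >= target:
            break
        if i in used:
            continue
        partners = [
            j for j in range(len(out))
            if j != i and j not in used
            and out[j][area_field] != out[i][area_field]
            and all(out[j][f] == out[i][f] for f in key_fields)
        ]
        if not partners:
            continue                   # no similar record in another area
        j = rng.choice(partners)
        out[i][area_field], out[j][area_field] = out[j][area_field], out[i][area_field]
        used.update((i, j))
        done += 1
    return out

def area_table(records, key_fields, area_field="area"):
    """Count records per (area, key field values) cell."""
    return Counter((r[area_field],) + tuple(r[f] for f in key_fields) for r in records)

# Hypothetical records, invented for this example.
sample = [
    {"id": 1, "area": "A", "age_band": "10-14", "sex": "F"},
    {"id": 2, "area": "B", "age_band": "10-14", "sex": "F"},
    {"id": 3, "area": "A", "age_band": "10-14", "sex": "M"},
    {"id": 4, "area": "B", "age_band": "10-14", "sex": "M"},
    {"id": 5, "area": "A", "age_band": "15-19", "sex": "F"},
    {"id": 6, "area": "C", "age_band": "15-19", "sex": "F"},
]
swapped = swap_records(sample, ["age_band", "sex"], rate=0.5)
moved = sum(a["area"] != b["area"] for a, b in zip(sample, swapped))
```

Because partners must match on the chosen fields, the published counts by area, age band and sex stay the same, while anyone trying to trace a particular cell back to a person can no longer be sure that person's record is really in that area.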
So is the Lifestyle Review – a document that quizzes young people about their drink, drugs and sex habits – genuinely unidentifiable? Given that it carries postcode and must identify age to within a year or so, it is hard to share the LCHS conclusion that it is.
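One rough way to test a claim of anonymity is to count how many respondents share each combination of potentially identifying fields. The sketch below assumes a survey extract with `postcode` and `age` columns; the records, the postcodes and the threshold `k` are all invented for illustration and are not taken from the LCHS survey or from any legal test under the DPA.

```python
from collections import Counter

def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def at_risk(records, quasi_identifiers, k=2):
    """Return records whose combination of values is shared by fewer than k records."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] < k
    ]

# Hypothetical responses: no names, but postcode and age are present.
responses = [
    {"postcode": "LN1 1AA", "age": 12, "answer": "yes"},
    {"postcode": "LN1 1AA", "age": 12, "answer": "no"},
    {"postcode": "LN1 1AA", "age": 13, "answer": "no"},
    {"postcode": "LN2 2BB", "age": 12, "answer": "yes"},
    {"postcode": "LN2 2BB", "age": 13, "answer": "yes"},
    {"postcode": "LN2 2BB", "age": 13, "answer": "no"},
]
exposed = at_risk(responses, ["postcode", "age"], k=2)
```

In this made-up extract, two of the six respondents are the only person of their age in their postcode, so anyone who knows where a pupil lives and how old they are could read off that pupil's answer. A full postcode typically covers only a handful of households, which is why the combination matters more than either field alone.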
Meanwhile, the Daily Mail have been stirring up shock and horror at the idea that the Equality and Human Rights Commission (EHRC) are planning to siphon off data from sources that include "visits to A&E departments, government surveys and the reporting of crimes to police" and place it on a "huge 'Lifestyle Database'".
Well: not quite. As the Mail later admits, "it will not be possible to identify individuals from the information on the database": but even that probably over-states the case. The database in question is an aggregated data system that El Reg was alerted to some months back.
We investigated it then and found it was little more than an online query tool designed to support the EHRC’s Equality Measurement Framework. The tool would allow researchers to pull down statistics on 10 domains of equality – including life, health, productive activities, education and employment – broken down by group.
Although the Mail focusses on the fact that the database will attempt to hold data on sexual orientation and identity, a spokesman for the project told El Reg today that general reluctance to answer questions about sexuality meant that this tool would be most useful in respect of disability, gender and age.
So, bearing in mind the caveats about identifiability expressed above, this database will contain no individual records and it is unlikely that any individual will be identifiable through it: however, data will be collected without direct consent.
In the end, there is no scandal here: just a mindset on the part of some officials that if the end is for the public good, then it doesn’t matter if the rules around data collection get slightly bent. That would appear to be joined, at the lower levels of government, by a poor understanding of the letter of the law when it comes to Data Protection. The same departments that are so obstructive when it comes to dishing out information are rather less well informed as to the scope of the Act when it comes to identifiability.
The Tories say they intend to shake up the lazy assumption that public services are entitled to use our data just because they can, and to require all such uses in future to be tested according to necessity. Whether anything will change – or whether the researchers will continue to win the day – remains to be seen. ®