Feeds

Guidelines needed to protect anonymity

It's an information free for all

Next gen security for virtualised datacentres

In early August, officials at America Online released information about searches being conducted by AOL members and users of the AOL search tool. This historical data was released onto the internet by several AOL officials to demonstrate how useful such data could be for tracking patterns, uses and interest of AOL members.

The data was anonymised, with members being assigned random ID numbers instead of userid's or names, and was only online for a few days.

The New York Times demonstrated, however, how easy it was to take that anonymised data, and with a few keystrokes, determine the identity of the searcher, and their personal interests, likes and dislikes – indeed to create a profile of users from this anonymized data.

The persons responsible for the "data breach" at AOL were fired – more for a public relations problem than anything else. The case demonstrates how any database, once collected, can be misused, and the significant lack of legal protection for similar information.

Personally identifiable

Privacy laws, both in the United States and abroad generally protect the collection, dissemination and use of "personally identifiable information" of various types and classes. This includes, for example things such as identifiable banking or financial information, personal health information, credit card or payment card information, and personal communications (for example, contents of emails).

Aggregated information on the other hand is not generally afforded the same level of protection. Thus, information about trends, overall internet use, health care utilisation, overall buying patterns, and the like is generally treated as the property of the institution that creates, collects, stores or collates this information.

If it is easy to convert the aggregate information into identifiable information, it may be afforded some level of protection, or may still be treated as identifiable information.

For many companies, there is a blurring of the lines between personal information (that is information about ME) and aggregate information. So, for example, Google collects information about every single thing I look for – every search request, the contents of everything delivered, what I click on, where I go from there.

It keeps both the aggregate information (how many people buy stuff off those ads on the side) and the personal information (tell me everything YOU have looked at this month). The aggregated information is analysed, processed, sold, and used by Google to increase advertising revenue, do load balancing – all kinds of things.

The same is true of ISPs and ecommerce sites. They collect and analyse massive amounts of information about even the most intimate details about you – who you chat with, who you email, what you read, what you post, and potentially even the source, destination and length of your VoIP calls.

Unless they have agreed not to in a Terms of Service agreement, there is virtually nothing preventing them from using this data, in an aggregated and "anonymous" fashion, and very little preventing them from using it otherwise.

Governments – particularly the US government – have taken advantage of this fact to attempt to obtain massive amounts of information. For example, during the course of litigation involving the government's efforts to prohibit materials that are "harmful to minors" the US government subpoenaed from the largest search companies (Yahoo!, MSN, and Google) massive amounts of such aggregate information.

When they got the cooperation of various telephone companies to turn over massive amounts of telephone calling records (non-content information) they apparently argued that such aggregated information (in that case not anonymised) was not entitled to legal protection.

The problem is, as The New York Times learned, it is relatively easy to convert this anonymised information into pointers to learn its source.

The essential guide to IT transformation

More from The Register

next story
Goog says patch⁵⁰ your Chrome
64-bit browser loads cat vids FIFTEEN PERCENT faster!
Chinese hackers spied on investigators of Flight MH370 - report
Classified data on flight's disappearance pinched
NIST to sysadmins: clean up your SSH mess
Too many keys, too badly managed
Attack flogged through shiny-clicky social media buttons
66,000 users popped by malicious Flash fudging add-on
New twist as rogue antivirus enters death throes
That's not the website you're looking for
ISIS terror fanatics invade Diaspora after Twitter blockade
Nothing we can do to stop them, says decentralized network
prev story

Whitepapers

A new approach to endpoint data protection
What is the best way to ensure comprehensive visibility, management, and control of information on both company-owned and employee-owned devices?
Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Maximize storage efficiency across the enterprise
The HP StoreOnce backup solution offers highly flexible, centrally managed, and highly efficient data protection for any enterprise.
How modern custom applications can spur business growth
Learn how to create, deploy and manage custom applications without consuming or expanding the need for scarce, expensive IT resources.
Next gen security for virtualised datacentres
Legacy security solutions are inefficient due to the architectural differences between physical and virtual environments.