Guidelines needed to protect anonymity

It's an information free for all

In early August, officials at America Online released information about searches being conducted by AOL members and users of the AOL search tool. This historical data was released onto the internet by several AOL officials to demonstrate how useful such data could be for tracking the patterns, uses and interests of AOL members.

The data was anonymised, with members being assigned random ID numbers instead of user IDs or names, and was only online for a few days.

The New York Times demonstrated, however, how easy it was to take that anonymised data and, with a few keystrokes, determine the identity of the searcher and their personal interests, likes and dislikes – indeed, to create a profile of users from this anonymised data.
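To make the mechanism concrete, here is a minimal sketch of why swapping a name for a random ID number does not break the link between one person's searches. It is not the method the newspaper used; the log rows, the ID numbers and the helper names `build_profiles` and `matching_ids` are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical "anonymised" log rows: (random_id, query).
# Every value here is made up for illustration only.
LOG = [
    (1001, "landscapers in springfield"),
    (1001, "jane example family tree"),
    (2002, "cheap flights"),
    (1001, "homes sold on maple street springfield"),
    (2002, "weather tomorrow"),
]


def build_profiles(rows):
    """Group every query under the random ID it was logged with."""
    profiles = defaultdict(list)
    for random_id, query in rows:
        profiles[random_id].append(query)
    return dict(profiles)


def matching_ids(profiles, clues):
    """Return the IDs whose combined queries mention every clue."""
    return sorted(
        random_id
        for random_id, queries in profiles.items()
        if all(any(clue in query for query in queries) for clue in clues)
    )


profiles = build_profiles(LOG)
# A town and a surname, taken together, single out one ID.
print(matching_ids(profiles, ["springfield", "example"]))
```

Because the ID stays the same across all of a member's searches, each extra query narrows the set of people it could belong to, even though no name appears anywhere in the data.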

The persons responsible for the "data breach" at AOL were fired – more for a public relations problem than anything else. The case demonstrates how any database, once collected, can be misused, and how little legal protection exists for similar information.

Personally identifiable

Privacy laws, both in the United States and abroad, generally regulate the collection, dissemination and use of "personally identifiable information" of various types and classes. This includes, for example, identifiable banking or financial information, personal health information, credit card or payment card information, and personal communications (for example, the contents of emails).

Aggregated information, on the other hand, is not generally afforded the same level of protection. Thus, information about trends, overall internet use, health care utilisation, overall buying patterns, and the like is generally treated as the property of the institution that creates, collects, stores or collates this information.

If it is easy to convert the aggregate information into identifiable information, it may be afforded some level of protection, or may still be treated as identifiable information.

For many companies, there is a blurring of the lines between personal information (that is, information about ME) and aggregate information. So, for example, Google collects information about every single thing I look for – every search request, the contents of everything delivered, what I click on, where I go from there.

It keeps both the aggregate information (how many people buy stuff off those ads on the side) and the personal information (tell me everything YOU have looked at this month). The aggregated information is analysed, processed, sold, and used by Google to increase advertising revenue, do load balancing – all kinds of things.

The same is true of ISPs and ecommerce sites. They collect and analyse massive amounts of information about even the most intimate details of your life – who you chat with, who you email, what you read, what you post, and potentially even the source, destination and length of your VoIP calls.

Unless they have agreed not to in a Terms of Service agreement, there is virtually nothing preventing them from using this data, in an aggregated and "anonymous" fashion, and very little preventing them from using it otherwise.

Governments – particularly the US government – have taken advantage of this fact to attempt to obtain massive amounts of information. For example, during the course of litigation involving the government's efforts to prohibit materials that are "harmful to minors", the US government subpoenaed massive amounts of such aggregate information from the largest search companies (Yahoo!, MSN, and Google).

When the government obtained the cooperation of various telephone companies in turning over massive amounts of telephone calling records (non-content information), it apparently argued that such aggregated information (in that case not anonymised) was not entitled to legal protection.

The problem is that, as The New York Times learned, it is relatively easy to turn this anonymised information into pointers that reveal its source.
