Original URL: http://www.theregister.co.uk/2006/08/10/aol_creepy_database/

Ancient satire foretold AOL's privacy disaster

Igor, to the machines! We have a sample...

By Andrew Orlowski

Posted in Media, 10th August 2006 01:40 GMT

"The Internet is becoming more and more widespread and will increasingly represent a scientific random sample of the population" - Joi Ito

One thing seems to have been forgotten following AOL's careless, but quite magnificent data dump of the internet's "hive mind" at play this week.

AOL's assiduous documentation of the private thoughts of over 600,000 web searchers has certainly added some much-needed sparkle to a public internet that, of late, has been in dire need of a tonic. Now, internet users' most private thoughts are revealed, in all their banality and creepiness, and we must count ourselves fortunate.

"AOL's data sketch sometimes scary picture of personalities searching Net," was the headline USA Today newspaper chose, but this barely conveyed the voyeuristic frisson, or glee, we felt as the AOL database made its way across the net.

Nothing in recent months has made the net come alive quite like these queries, and it's not hard to see why. Recently, the net has been drowning in banality. Billions of identical blogs - some human generated, some machine generated - spring up every day, with identical opinions to match the identical templates each blog hoster seems to provide. This outpouring of new recorded writing has been trumpeted as a new era in human expression. But in practice, the consequence of all this is that it's getting increasingly difficult to tell which is which. Human, or machine?

As you'll know if you've checked your email recently, the spam-generating robots, programmed to defy Bayesian filters, have become ever more creative. At the same time, the much-heralded blog army - often soccer moms who on approaching a QWERTY keyboard realize that what needs to be said is better said elsewhere - have done little more than post pictures of cats and sunsets, good luck charms cast into the void. That's a sweet and delightful thing to do under the circumstances, but it isn't an enormous contribution to human well-being, the reconfiguring of society that technology evangelists say is happening now. Or any second now. So it's nice to have some real human input for a change.

Snorer sufferers... Scar-worrying Mormons ... Adulterers and Adulteresses ... Wife-murderers ... welcome on board, and H-ello!

But back to business, and let's focus on an aspect lost in the "scandal". The thing that everyone has overlooked is that this wasn't an accidental or negligent data loss by AOL. The search query data was sincerely released in the name of science.

Boffins at AOL Labs published the data for boffins at similar "labs" to peruse.

That's strange enough in itself, and it should make you yearn for the white-coated frontiersmen of yore. Things have changed a bit since then.

Behold: The Mighty Atom

Fifty years ago, scientists did things like, oh... split the atom, and deduce the shape of the DNA double helix. Today, working off the hottest and freshest evidence available, scientists proclaim breakthroughs such as "People get more drunk at weekends".

Once upon a time scientists set out to describe the unknown, and make it understood in mechanical terms. But now, like a group of well meaning, but slightly simple lifelong in-patients making their first tentative steps into the real world, they venture out to find what's on their doorsteps.

Now, if science is to have any useful purpose in society, it's in describing the unknown, not the bleeding obvious. No wonder it has gotten such a bad name recently.

How much easier it would have been, we suggest, if internet companies had from the outset sought what they would eventually publish anyway: the boring and creepy things that people type into their computers.

If good jokes can tell us truths that are otherwise unmentionable, then perhaps satire can offer us a glimpse of the future that futurologists dare not mention.

In fact, it just has.

AOL's privacy fiasco was foretold by a splendid prank some Register readers will recall from a few years ago, which looks even better this week than it did at the time.

Brian Del Vecchio created a spoof site called AIMSearch, announcing it with the following:

"In November of 2001 AOL Time Warner, responding to a subpoena from Attorney General John Ashcroft, made available to the Justice Department a complete archive of all private conversations held over AOL Instant Messenger (AIM). Through the power of the Freedom of Information Act (FOIA), Google was able to obtain a copy of this entire logfile, totaling over 2 terabytes of conversations previously thought to be private."

Then came the kicker:

"This unique resource provides insight into the minds of potential anti-American terrorists, cheating spouses, and countless computer neophytes."

And so we have it. Within a fortnight, Google had objected to the misuse of its trademark, and the prank ceased.

AIMSearch - your private thoughts recorded, and published

But how much easier it would have been for this new generation of "scientific" sociologists, who thanks to data sets like AOL's query database claim to know so much more than we do about ourselves, and who place so much value on the internet's "hive mind" (cf. technology utopian Joi Ito), if it had been clear at the time who was speaking to whom.

The author of the prank, Del Vecchio, told us today:

"Back in 2002, just a few months after 9/11, we wondered how people would react if AOL were to cave to demands from the government and massively betray user privacy. We wanted them to feel that betrayal like a kick in the gut, even if just for a brief second," he wrote.

"Four years later, there is still a huge gap between the privacy that users imagine they have, and the laxity with which service providers like AOL guard that privacy. I think users like 711391 may be feeling that kick in the gut right now."

Indeed. We all are.

On the other hand, user 711391 ("christian women caught in extramarital affairs") really needs help.

And spare a thought for the fellow who typed in "how to murder your wife" dozens and dozens of times. Surely, by now, he'll be feeling some disappointment that the much vaunted internet, this fabled electronic communications medium, hasn't yet conjured forth an elite squad of Ninja Assassins to finish her off.

People will always type dark and dirty thoughts into computers - we guess that's why public, open computer networks were invented, as a kind of public sinkhole. But to turn these private writings into a basis for a new sociology seems to be a little presumptive.

In fact, taking anything that's typed into the public sinkhole seriously ought to worry us. The "murder my wife" chap is a staple of Northern folklore - he may well one day die a peaceful death having done nothing more harmful than forget to feed his cat. Meanwhile, there are law enforcement agencies who, following the same scientific principles of guesswork and presumption as the "AOL scientists", may be keen to argue otherwise.

So if science is to devote itself to this collection of data, may we suggest it be careful. Or preferably, find more pressing issues with which to concern itself. ®