
Archive.org suffers Fahrenheit 911 memory loss

Online fire extinguished


Opinion You don't often think about libraries in terms of strength. Few mayors tout the large sack of the local book depository or put it up against a massive skyscraper during PR stunts. Libraries are pretty passive creatures that receive some credit for the quantity of volumes they hold but not much credit these days for being powerful entities.

That is, until you run across something like Archive.org. For where the Library of Congress exudes strength, Archive.org piddles weakness. The site is really a reminder of how far the Internet has not come and how strong some old traditions really are.

The supposed Internet archiving site is not a passive entity at all. It doesn't simply collect more and more data for the use of researchers as it claims. Instead, Archive.org actively engages in odd publicity stunts and actively pulls down information. What could be weaker than a media-hungry library with disappearing material?

On Wednesday, Archive.org put up a copy of Michael Moore's Fahrenheit 911 documentary for download. The site was apparently responding to an interview in which Moore said he didn't mind people downloading the movie as long as the sites offering it didn't profit from the action. So Archive.org flexed its freedom of information/culture muscle and boldly offered the movie in a variety of formats.

An intern here in The Register's Chicago office was ordered to test the download out. It worked. Our intern - Streaming Sally - used the FreeCache technology Archive.org recommended, and the download took about 3 hours. The movie came in a bit choppy but certainly watchable - so Sally said.

But just hours after putting up the movie, Archive.org pulled it down. In the movie's place was a note that read, "This is under copyright, and archive.org needs to pull it before any damage happens."

Think of this as a child fondling a can of spray paint but then stepping away from the school wall before "any damage happens." Or a seven-year-old contemplating a ten-yard run with scissors in hand and then putting the weapon down before "any damage happens." However you think about it, it's clear that there are children running Archive.org - the kind that play copyright gags while doing shots of Pepsi late into the night.

We know this because Archive.org has long had a childlike relationship with information. Our first indication of this came back in 2002. At that time, Intel had accidentally released the code-name of an upcoming project - Nehalem. One of Intel's engineers discussed the project in an interview conducted by Intel itself and posted on Intel's web site. Some schmuck of a reporter found the code-name and did a story on it.

Intel's PR machine then went into action. First, it removed the interview from its site. Then, it called Google to make sure no copies of the interview lurked in Google's cache. Then, it called Archive.org to remove any trace of the interview at all.

"Libraries exist to preserve society’s cultural artifacts and to provide access to them. If libraries are to continue to foster education and scholarship in this era of digital technology, it’s essential for them to extend those functions into the digital world.

"Open and free access to literature and other writings has long been considered essential to education and to the maintenance of an open society. Public and philanthropic enterprises have supported it through the ages.

"The Internet Archive is opening its collections to researchers, historians, and scholars. The Archive has no vested interest in the discoveries of the users of its collections, nor is it a grant-making organization."

This is pretty big talk for a toddler of a library. The Intel incident is by no means the first or only time Archive.org has pulled information at a vendor's or user's request. Exactly how a vendor that posts information in a public forum of its own volition can then go back and claim it's proprietary is beyond us, and how a "library" can obey such a request defies comprehension. We're not talking about Windows source code here, friends.

Beyond any of this, Archive.org does a poor job of recording sites - you know, the ones it doesn't erase. Response times are horrible and, more often than not, only a few old examples of sites exist.

Without question, an Internet library raises tricky questions. How, for example, can you archive a libelous story when both the publisher and subject agree the original must be pulled? Not the best of situations. Still, we're pretty sure Archive.org is not the caliber of organization needed to clear up these serious matters.

The upshot of all this is that we desperately need a "real" Internet archive - one that doesn't pretend to be brave for a few hours as part of some information stunt and one that doesn't delete the very records it's supposed to keep. ®
