Archive.org suffers Fahrenheit 9/11 memory loss

Online fire extinguished

Opinion You don't often think about libraries in terms of strength. Few mayors tout the large sack of the local book depository or put it up against a massive skyscraper during PR stunts. Libraries are pretty passive creatures that receive some credit for the quantity of volumes they hold but not much credit these days for being powerful entities.

That is, until you run across something like Archive.org. For where the Library of Congress exudes strength, Archive.org piddles weakness. The site is really a reminder of how little distance the Internet has covered and how strong some old traditions really are.

The supposed Internet archiving site is not a passive entity at all. It doesn't simply collect more and more data for the use of researchers as it claims. Instead, Archive.org actively engages in odd publicity stunts and actively pulls down information. What could be weaker than a media-hungry library with disappearing material?

On Wednesday, Archive.org put up a copy of Michael Moore's Fahrenheit 9/11 documentary for download. The site was apparently responding to an interview in which Moore said he didn't mind people downloading the movie as long as the sites offering it didn't profit from the action. So Archive.org flexed its freedom of information/culture muscle and boldly offered the movie in a variety of formats.

An intern here in The Register's Chicago office was ordered to test the download out. It worked. Our intern - Streaming Sally - used the FreeCache technology Archive.org recommended, and the download took about 3 hours. The movie came in a bit choppy but certainly watchable - so Sally said.

But just hours after putting up the movie, Archive.org pulled it down. In the movie's place was a note that read, "This is under copyright, and archive.org needs to pull it before any damage happens."

Think of this as a child fondling a can of spray paint but then stepping away from the school wall before "any damage happens." Or a seven-year-old contemplating a ten-yard run with scissors in hand and then putting the weapon down before "any damage happens." However you think about it, it's clear that there are children running Archive.org - the kind that play copyright gags while doing shots of Pepsi late into the night.

We know this because Archive.org has long had a childlike relationship with information. Our first indication of this came back in 2002. At that time, Intel had accidentally released the code-name of an upcoming project - Nehalem. One of Intel's engineers discussed the project in an interview conducted by Intel itself and posted on Intel's web site. Some schmuck of a reporter found the code-name and did a story on it.

Intel's PR machine then went into action. First, it removed the interview from its site. Then, it called Google to make sure no copies of the interview lurked in Google's cache. Then, it called Archive.org to remove any trace of the interview at all.

Libraries exist to preserve society’s cultural artifacts and to provide access to them. If libraries are to continue to foster education and scholarship in this era of digital technology, it’s essential for them to extend those functions into the digital world.

Open and free access to literature and other writings has long been considered essential to education and to the maintenance of an open society. Public and philanthropic enterprises have supported it through the ages.

The Internet Archive is opening its collections to researchers, historians, and scholars. The Archive has no vested interest in the discoveries of the users of its collections, nor is it a grant-making organization.

This is pretty big talk for a toddler of a library. The Intel incident is by no means the first or only time Archive.org has pulled information at a vendor or user's request. Exactly how a vendor that of its own volition posts information in a public forum can then go back and claim it's proprietary is beyond us, and how a "library" can obey this request defies comprehension. We're not talking about Windows source code here, friends.

Beyond any of this, Archive.org does a poor job of recording sites - you know, the ones it doesn't erase. Response times are horrible and, more often than not, only a few old examples of sites exist.

Without question, an Internet library raises tricky questions. How, for example, can you archive a libelous story when both the publisher and subject agree the original must be pulled? Not the best of situations. Still, we're pretty sure Archive.org is not the caliber of organization needed to clear up these serious matters.

The upshot of all this is that we desperately need a "real" Internet archive - one that doesn't pretend to be brave for a few hours as part of some information stunt and one that doesn't delete the very records it's supposed to keep. ®

Related stories

Britain's Web presence to be saved
Vivendi spinoff takes MP3.com archive private
The Persistence of Hoax
Defacement contest likely to target Web hosting firms
