Oh My GOD! Have the TORIES ERASED THE INTERNET?*
Pass the Lizard-proof tinfoil, I need to make a hat
A story that the Conservatives “made the internet disappear” has ignited news channels today. In fact, the story demonstrates yet again how ignorant most journalists are of the basic workings of the internet - and it demonstrates how the thirst for conspiratorial thinking dominates political news.
Perhaps not surprisingly, this story comes from freelance blogger Mark Ballard, writing for twilight IT site Computer Weekly. Mark used to be a journalist here at The Register: he had to leave some years ago, but we remember him well.
Ballard reports in classic style that the Conservative Party has "erased" a bunch of old speeches from the "internet". Sky News, picking up Ballard's post, tells us how “the Tories had used something called a robot blocker to remove speeches”, echoing the crazed conspiratorial tone of the Weakly's coverage - which adds:
The erasure had the effect of hiding Conservative speeches in a secretive corner of the internet like those that shelter the military, secret services, gangsters and paedophiles.
In fact, it's simpler than that: the Tories just dumped all their old speeches off their website and put notices in their robots.txt file asking webcrawlers ("the robots" that the Tories "blocked", if you've wandered in here from somewhere else) to drop the same pages from the archives they feed around the internet**. Mark and other scribblers at the Beeb, Sky, Guardian, BuzzFeed etc were stunned to find that this meant the speeches were no longer available at the Wayback Machine, which they had fondly imagined to be "the internet", and to serve as an imperishable archive of everything ever published online.
But this is business-as-usual. The shock and horror is generated by a misconception: that the internet in general, and the Wayback Machine (aka the library hosted by non-profit firm Archive.org) in particular, are bulletproof repositories of information. The word "archive" implies permanence – a fortress of data integrity impervious to time, war and large egos. But Archive.org is not an archive. It never has been.
We found this out nine years ago. The Wayback Machine is very fragmentary, and - of course - removes information on request. On that occasion the PR department of chip giant Intel had requested that a three-year-old interview with an engineer be removed. Wayback complied – as it does every time. After all, it seldom has any right to copy and republish content scraped from other people's websites.
There are other reasons why it's no surprise that the Wayback crawlers comply with robots.txt files on websites: if a crawler doesn't, it is liable to be blocked by all right-thinking webmasters. Dubious bot crawlers often try to pretend not to be robots, so that they can ignore robots.txt, but this isn't simple - they are often detected by alert web admins or their systems - and that's a dangerous route to go down when you're claiming to be a reputable organisation.
So there are good bots and bad bots, on the real internet. But not on Mark Ballard's. He writes:
The bots were what made the democratization of information possible. It was bots that inspired Cameron and Osborne. It was bots that were going to free us from serfdom in the way they said we would be. Without the bots you just had pockets of power and privilege for those in the know. Without the bots you just had the same old concentration of wealth and power there had always been, since long before the Internet Archive started taking snapshots of the Conservative website in 1999.
Knockabout stuff. And the Tories taking all their old promises off their website - and updating their robots.txt to reflect this - would almost be a small piece of news on a slow day, though all the "erasing the internet" and "criminals and paedophiles" foolery is absurd.
Except it would only be fair to note that the Labour Party has "erased the internet" too. Labour’s housecleaning has removed almost everything prior to the start of the current leadership, and the Wayback Machine (sorry, "the internet") is pretty empty of Labour's past as well as that of the Tories.
So, not really even news: "political parties scrub away all their old promises as long run-up to election begins". Boring.
If we’re to hold politicians to account then this means proper, rational debate. That means, yes, keeping a record of what they say - but you're better off taking your own copies than relying on robots and the varied cloud systems they serve to do it for you - and then complaining that someone has "erased the internet" like a "criminal paedophile" when you are let down.
But people would normally much prefer a conspiracy to blame. Once it was the Right that dealt largely in the language of conspiracy theories, as Democrat historian Richard Hofstadter wrote in his famous 1964 essay The Paranoid Style in American Politics. There were Reds to be found under every Bed. But listen to any academic venting about “neoliberalism” and you can just as easily erase the word “neoliberal” and substitute the word “Illuminati”. It's escapism and a form of narcissism, really.
Every conspiracy theorist finds the conspiracy their heart desires, eventually. Readers may be interested to note that when Mark left the Reg, he took with him several folders of notes, and left behind quite a lot more.
The ones he took with him were all labelled "Military Industrial Complex", followed by sequential serial numbers. ®
*Headlines to which the answer is no.
** Newcomers might care to have a look here to learn more about the robot exclusion protocol, which "is not intended for access control, so don't try to use it as such. Think of it as a 'No Entry' sign, not a locked door."
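For anyone who wants to see the "No Entry" sign in action, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard urllib.robotparser module. The rules, the bot name ExampleArchiveBot, the example.org URLs and the may_fetch helper are all invented for illustration; this is not the Conservative Party's actual file, nor a claim about how the Wayback Machine's crawler is written.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, made up for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /speeches/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    bot = "ExampleArchiveBot"  # hypothetical crawler name
    for url in (
        "https://example.org/speeches/2007/some-speech.html",
        "https://example.org/news/",
    ):
        verdict = "allowed" if may_fetch(ROBOTS_TXT, bot, url) else "asked to stay out"
        print(f"{url}: {verdict}")
```

Nothing here enforces anything. The check happens entirely on the crawler's side, so a polite bot skips the disallowed path and a rude one simply never asks.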