Pass the Lizard-proof tinfoil, I need to make a hat


A story that the Conservatives “made the internet disappear” has ignited news channels today. In fact, the story demonstrates yet again how ignorant most journalists are of the basic workings of the internet - and it demonstrates how the thirst for conspiratorial thinking dominates political news.

Perhaps not surprisingly, this story comes from freelance blogger Mark Ballard, writing for twilight IT site Computer Weekly. Mark used to be a journalist here at The Register: he had to leave some years ago, but we remember him well.

Ballard reports in classic style that the Conservative Party has "erased" a bunch of old speeches from the "internet". Sky News, picking up Ballard's post, tells us how “the Tories had used something called a robot blocker to remove speeches”, echoing the crazed conspiratorial tone of the Weakly's coverage - which adds:

The erasure had the effect of hiding Conservative speeches in a secretive corner of the internet like those that shelter the military, secret services, gangsters and paedophiles.

In fact, it's simpler than that: the Tories just dumped all their old speeches off their website and put notices in their robots.txt file asking web crawlers ("the robots" that the Tories "blocked", if you've wandered in here from somewhere else) to do the same in their own archives around the internet**. Mark and other scribblers at the Beeb, Sky, Guardian, BuzzFeed etc were stunned to find that this meant the speeches were no longer available at the Wayback Machine, which they had fondly imagined to be "the internet", and to serve as an imperishable archive of everything ever published online.

But this is business-as-usual. The shock and horror is generated by a misconception: that the internet in general, and the Wayback Machine (aka the library hosted by non-profit firm Archive.org) in particular, are bulletproof repositories of information. The word "archive" implies permanence – a fortress of data integrity impervious to time, war and large egos. But Archive.org is not an archive. It never has been.

We found this out nine years ago. The Wayback Machine is very fragmentary, and - of course - removes information on request. On that occasion the PR department of chip giant Intel had requested that a three-year-old interview with an engineer be removed. Wayback complied – as it does every time. After all, it seldom has any right to copy and republish content scraped from other people's websites.

There are other reasons why it's no surprise that the Wayback crawlers comply with robots.txt files on websites: if a crawler doesn't, it is liable to be blocked by all right-thinking webmasters. Dubious bot crawlers often try to pretend not to be robots, so that they can ignore robots.txt, but this isn't simple - they are often detected by alert web admins or their systems - and that's a dangerous route to go down when you're claiming to be a reputable organisation.

So there are good bots and bad bots, on the real internet. But not on Mark Ballard's. He writes:

The bots were what made the democratization of information possible. It was bots that inspired Cameron and Osborne. It was bots that were going to free us from serfdom in the way they said we would be. Without the bots you just had pockets of power and privilege for those in the know. Without the bots you just had the same old concentration of wealth and power there had always been, since long before the Internet Archive started taking snapshots of the Conservative website in 1999.

Knockabout stuff. And the Tories taking all their old promises off their website - and updating their robots.txt to reflect this - would almost be a small piece of news on a slow day, though all the "erasing the internet" and "criminals and paedophiles" foolery is absurd.

Except it would only be fair to note that the Labour Party has "erased the internet" too. Labour’s housecleaning has removed almost everything prior to the start of the current leadership, and the Wayback Machine (sorry, "the internet") is pretty empty of Labour's past as well as that of the Tories.

So, not really even news: "political parties scrub away all their old promises as long run-up to election begins". Boring.

If we’re to hold politicians to account then this means proper, rational debate. That means, yes, keeping a record of what they say. But you're better off taking your own copies than relying on robots, and the varied cloud systems they serve, to do it for you, and then complaining that someone has "erased the internet" like a "criminal paedophile" when you are let down.

But people would normally much prefer a conspiracy to blame. Once it was the Right that dealt largely in the language of conspiracy theories, as Democrat historian Richard Hofstadter wrote in his famous 1964 essay The Paranoid Style in American Politics. There were Reds to be found under every Bed. But listen to any academic venting about “neoliberalism” and you can just as easily erase the word “neoliberal” and substitute the word “Illuminati”. It's escapism and a form of narcissism, really.

Every conspiracy theorist finds the conspiracy their heart desires, eventually. Readers may be interested to note that when Mark left the Reg, he took with him several folders of notes, and left behind quite a lot more.

The ones he took with him were all labelled "Military Industrial Complex", followed by sequential serial numbers. ®

*Headlines to which the answer is no.

** Newcomers might care to have a look here to learn more about the robot exclusion protocol, which "is not intended for access control, so don't try to use it as such. Think of it as a 'No Entry' sign, not a locked door."
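
For newcomers who would like to see the 'No Entry' sign in action, here is a minimal sketch in Python of the check a well-behaved crawler might run before fetching a page. It uses the standard library's urllib.robotparser. The robots.txt contents, the example.org URLs, the "example-archiver" user agent and the may_fetch helper are all made up for illustration; none of them is taken from any party's actual site or from the Wayback Machine's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, invented for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /speeches/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler asks first and skips anything that is disallowed.
    for url in (
        "https://www.example.org/speeches/2009/",  # under the disallowed path
        "https://www.example.org/news/",           # not covered by any rule
    ):
        verdict = "fetch" if may_fetch(ROBOTS_TXT, "example-archiver", url) else "skip"
        print(verdict, url)
```

Note that the check lives entirely in the crawler's own code. Nothing stops a bad bot from skipping it, which is why the protocol is a sign and not a lock.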
