
British Library wants taxpayer to gobble the web

Cost? We don't know


The British Library wants to archive the UK web, creating an invaluable national treasure trove of porn, celebrity trivia, gossip and Daily Mail comments. But it admits it can't put a figure on the project - which looks like becoming a huge, open-ended commitment for the taxpayer.

Today the Library stepped up the pressure for the law to be changed to allow copyright libraries to copy other copyright holders' web material for research purposes. Five statutory libraries already have permission to make printed material available. Now the British Library says it wants the Web too.

"It's not a request for additional funding," a BL spokesperson said, but they couldn't say how much the creeping mission would end up costing us. At first, the BL won't archive every Tweet, but will do an annual crawl, with some sites such as No 10 Downing Street archived more often. That would amount to 220TB of data, which it reckons would cost about £4,000 to store.

But that would barely make a dimple in a replica of UK web output, now that so many non-web chat areas have migrated to a home between angle brackets. The BL acknowledges there are eight million sites.

What, we wondered, was the point of archiving every single "Ashlee Cole iz a slag" typed into a browser?

"It may be that somebody wants to look back and research celebrity and this could be important to their research," we were told.

No doubt. But every Tweet and comment?

It was cheaper, the spokesman assured us, than employing a curator to choose between the best Ashley/Cheryl comments (for example).

Ah, right. So the mechanics dictate the curation policy.

But it was also fairer, he added, because the neutral, objective web bot couldn't be accused of bias. Even in national conversations as momentous as the Cole divorce.

There are plenty of comments flying around this morning wondering why public money should be required to archive more than a handful of websites. Especially with Brewster Kahle's Archive.Org, which is privately funded.

At first the library told us the public was unaware that websites disappear without some part of the British state keeping a copy - an interesting claim. I've never met anyone who thinks all websites are preserved by some silent, omniscient backup programme.

Then the Library told us that the private sector couldn't be trusted to do the job, because future funding couldn't be assured. But with the British state in the red to the tune of £180bn this year, a deficit larger than Greece's in GDP terms (12.8 per cent), and frontline services such as nurses facing the chop, it's questionable whether anyone would prefer to keep a copy of those Mail comments instead. ®
