Original URL: http://www.theregister.co.uk/2013/04/05/british_library_to_harvest_web/

Publishing ANYTHING on .uk? From now, Big Library gets copies

'Preserving cultural ephemera ... FOREVER'

By Kelly Fiveash

Posted in Government, 5th April 2013 11:03 GMT

On the same day that thousands of public sector bods will go on strike in a row over pay, pensions and working conditions, new regulations will come into force at midnight tonight allowing the British Library to begin scraping content from UK websites.

Under the rules - known as legal deposit - the country's biggest collector of publications produced in the UK and Ireland will start harvesting what it described as "ephemeral materials like websites" to ensure that the content is "preserved forever".

That said, the British Library indicated that the word "forever" was relatively limited, by noting that a record of life and society in 21st-century Blighty might last 50, 100 or even 200 years into the future.

As of midnight, the British Library, the National Library of Scotland, the National Library of Wales, the Bodleian Libraries, Cambridge University Library and the Trinity College Library in Dublin will be granted access to every UK electronic publication.

The £3m-and-counting system, which will capture blogs, ebooks and the entire .UK web domain, follows the same principle under which print publishers are required to supply those libraries with copies of every book, magazine and newspaper published in Britain (and have been required to do so for several centuries).

"Legal deposit arrangements remain vitally important. Preserving and maintaining a record of everything that has been published provides a priceless resource for the researchers of today and the future," said culture minister Ed Vaizey.

The British Library said that the first live archiving crawl of the .UK web domain would be available to researchers by the end of this year. It said that the rules for harvesting the material had been agreed with government, the Legal Deposit Libraries and different parts of the publishing industry.

It further claimed that those parties had agreed on an efficient system for archiving digital publications that supposedly avoids placing an unreasonable burden on publishers while also apparently protecting the interests of rights-holders.

Meanwhile, British Library employees who are members of the Public & Commercial Services Union are set to down tools from 1pm today in a dispute over pay. ®