National Archives and MS strike preservation deal
Shining light on the 'digital dark age'
The National Archives (TNA) and Microsoft have come to an arrangement to support long-term access to digital documents.
Under a new Memorandum of Understanding (MoU), Microsoft will provide TNA with a system that combines previous versions of Windows and Office to help solve problems of managing historical records based on legacy formats.
In return, TNA will contribute its experience in digital preservation as input to the company's future product development. Ultimately, the aim is to ensure that information is retained and kept accessible in the face of rapid digital and technological change, avoiding what Microsoft calls "a new digital dark age".
TNA will make use of Microsoft's Virtual PC 2007, which makes it possible to run multiple operating systems at the same time on the same computer. This will allow it to configure any combination of Windows and Office from one PC, thereby allowing access to practically any document based on legacy Microsoft file formats.
Staff and visitors at TNA will be able to view historical information based on legacy formats in the way the author intended. TNA will also be able to make these documents more accessible by converting the information to new, open file formats.
Prior to signing the agreement, TNA's chief executive Natalie Ceeney said: "We are living in a world that is a ticking timebomb with regard to digital preservation. Public accountability is enshrined in the way we work. It is essential we have longevity in information to ensure this accountability is maintained."
Explaining that most content written today is on Microsoft platforms, she added that the partnership was only the "start of a journey" but still "a phenomenal step forward in bringing digital preservation into the mainstream".
She also said that the arrangement was not about creating digital assets: "This is not an ownership challenge but a migration and accessibility challenge."
Microsoft said the partnership reflects the efforts it has made to move away from a proprietary model. The latest releases of Office use the Open XML format, which is under the independent control of Ecma – an industry association dedicated to the standardisation of ICT systems.
"Microsoft took the step to implement XML based file formats that unlock data in documents, allowing them to be archived, restructured, aggregated and re-used in new and dynamic ways," said Gordon Frazer, UK managing director and vice president of Microsoft International.
"Our MoU with TNA will go beyond this and ensure that decisions we make in future products will meet the rigorous requirements of digital preservation. It is about tackling the potential of a digital dark age and to make sure that doesn't become a reality."
This article was originally published at Kablenet.
MS to blame, but not the only ones at fault
Microsoft definitely hold some blame here. If newer versions of Office could competently display old Word documents, there'd be no need for the mad scientist virtual machine solution. (I wonder if they have to buy a licence from MS for every version of DOS/Windows/Office they need to use for this? ;-) )
MS chose to rework the .DOC format for every new version of Office, forcing people to upgrade, and now they get to benefit from a bit of PR fluff about how they're making it possible to read old documents. Gotta love that lock-in.
But MS aren't solely to blame; they're just one of many peddlers of closed formats over the years, and this is a perfect example of why vendor-dependent formats are trouble waiting to happen. At least these days we have a couple of options which aren't tied to any one OS, software package or vendor and stand a good chance of being readable in the future. TNA need to be converting everything they can into such formats, starting now.
The hardware most certainly *can* be an issue. I may have a 3.5" floppy drive in my current system, but no standard PC floppy controller can read the Amiga's native disk format, so anything I didn't transfer when I *had* a working Amiga is now trapped on unreadable media. Luckily none of it is important enough to jump through hoops to recover, but TNA's archive is of significantly more importance to the national record than my A-level coursework is...
There's a subset of PDF designed for just this purpose. PDF/A (for PDF Archive) is an ISO standard (19005) based on a publicly available spec, and has various long-term oriented features like requiring all fonts to be embedded with their copyright information, color profiles to be embedded or otherwise well known and so on. We develop software that supports it and we're slowly seeing more interest in that aspect, particularly from government bodies, so I'd say it's here to stay.
Moons ago when I worked for organs of HMG I remember using two WP apps - Lex on a Vax and Perfect Writer on a PC (this was a Ferranti app). I'm glad to hear that MS and TNA are going to make my pontifications available for future generations :-)
I love how this sort of thing comes about as if it's some kind of fantastic new idea to preserve historical data, and as if it's actually some kind of problem that we're heading into.
It's not as if you can't go out and buy floppy disc drives these days. Hell, Google would have jumped at the chance to make software specifically designed to absorb all the information, correctly catalogue it onto giga/tera/petabyte drive arrays so all the info from the damn paper could be stored in 1/100th of the space.
Not to mention you'd be able to get a 3rd party on the books making an automated 5.25" floppy disc reader capable of loading, reading, and discarding the floppy suckers at a rate of knots.
Best quote - from BBC version
The best quote on this subject comes from the BBC's version (http://news.bbc.co.uk/1/hi/technology/6265976.stm)
"Ms Ceeney said: 'If you put paper on shelves, it's pretty certain it is going to be there in a hundred years.
'If you stored something on a floppy disc just three or four years ago, you'd have a hard time finding a modern computer capable of opening it.'"
She's not referring to the hardware at all; if you read the article, it says the hardware isn't an issue. So you seriously have to wonder why Ms Ceeney is having such a "hard time" opening a file created in 2004. Technology does move on, but that's just getting a bit carried away.