Internet pioneer Vint Cerf predicts the future, fears Word-DOCALYPSE
Big data? More like big problems for our grandchildren
Big data may turn out to be a big mystery to future generations, godfather of the internet Vint Cerf has warned.
The pioneering computer scientist, who helped design the TCP/IP protocol suite (along with Robert Kahn) before going on to work as chief internet evangelist for Google, has claimed that spreadsheets, documents and various collections of data will be unreadable by future generations.
In an interview on Monday, Cerf illustrated the problem by discussing how his up-to-date version of Microsoft Word can't read PowerPoint files created in 1997.
"I'm not blaming Microsoft," he said. "What I'm saying is that backward compatibility is very hard to preserve over very long periods of time."
Discussing scientists who are now busily gathering massive amounts of data, he warned that unless the data recording techniques of their projects are preserved using metadata, the information will be useless to future boffins. The problem is compounded if the research is carried out and recorded by private companies, which may go bust with the loss of all information about their methodology.
"If you don't preserve all the extra metadata, you won't know what the data means. So years from now, when you have a new theory, you won't be able to go back and look at the older data," he continued.
"We won't lose the disk, but we may lose the ability to understand the disk."
He spoke of the need for a "digital vellum that will preserve not only the bits, but a way of interpreting them as well," referring to the ancient practice of using animal skin to produce durable books or documents.
Cerf also contrasted the problems of modern data storage with the example of Pulitzer Prize-winning biographer Doris Kearns Goodwin, who visited more than 100 libraries whilst writing a book called Team of Rivals about President Lincoln and his government.
There is hope, however.
"It may be that the cloud computing environment will help a lot. It may be able to emulate older hardware on which we can run operating systems and applications," Cerf added in his chat to Computerworld. ®