Southampton Uni shows way to a truly open web

Making Berners-Lee's vision a reality

Waiting to exhale

In effect, Berners-Lee advocates want to link up data in an eloquent and constructive way on the web using something called DBpedia as the central repository for information garnered online. And yes, chillingly for some Reg readers, that does involve using Wikipedia as a major data source. It doesn't just take a 'suck it and see' approach, however. Instead it grabs "structured information" using sophisticated queries against Jimbo Wales's database and, importantly, links other datasets to Wikipedia from around the web.

In other words, it's a bit like telling a web surfer that the populist, if not wholly reliable, online encyclopedia shouldn't be the only source of information. Perhaps proponents of DBpedia would be happy if the database was eventually likened to a detail-obsessed librarian whose middle name is pedant. That certainly appears to be the goal, at least.
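For the code-minded, the linking idea can be sketched in a few lines of Python. This is a toy illustration only, not how DBpedia is actually built: the `example:` identifiers, the `describe` helper and the facts themselves are all made up for the purpose, and a real system would use RDF tooling and SPARQL rather than Python lists.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"  # the usual predicate for "these two identifiers mean the same thing"

# Each fact is a (subject, predicate, object) triple.
# All identifiers and values below are hypothetical placeholders.
wikipedia_derived = [
    ("example:wiki/Southampton", "example:label", "Southampton"),
    ("example:wiki/Southampton", SAME_AS, "example:council/southampton"),
]
other_dataset = [
    ("example:council/southampton", "example:publishesOpenData", "yes"),
]


def describe(subject, triples):
    """Collect facts about `subject`, following owl:sameAs links forwards."""
    index = defaultdict(list)
    for s, p, o in triples:
        index[s].append((p, o))

    facts, seen, queue = {}, set(), [subject]
    while queue:
        current = queue.pop()
        if current in seen:
            continue  # guard against sameAs cycles
        seen.add(current)
        for p, o in index[current]:
            if p == SAME_AS:
                queue.append(o)  # hop to the linked dataset's identifier
            else:
                facts.setdefault(p, o)  # first value seen for a predicate wins
    return facts


facts = describe("example:wiki/Southampton", wikipedia_derived + other_dataset)
```

Here `facts` ends up holding both the Wikipedia-derived label and the fact that only the second dataset knew about, which is the whole point of linking rather than copying.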

But unlocking information online remains a huge challenge, despite having a government in the UK that endorses the linked data desires of Berners-Lee, Southampton and others.

"Many research and evaluation projects in the few years of the Semantic Web technologies produced ontologies, and significant data stores, but the data, if available at all, is buried in a zip archive somewhere, rather than being accessible on the web as linked data," explained Berners-Lee back in 2006.

Christopher Gutteridge

'PDF is an embarrassment to our species'

Currently, even when public information is made available online, problems remain with the kind of data formats that are all too readily used by local government departments, academic institutions and other parts of the public sector.

"PDF is an embarrassment to our species," Gutteridge says of Adobe Systems' once-proprietary but now open standard for document exchange.

"PDF is a brilliant way to simulate A4 or portrait views. It was natural to create a new piece of technology to simulate the old ... But our screens are all A4 landscape yet there is this stupid insistence that the portrait way is still developed. It's a legacy thing and we haven't got around to getting rid of it yet. I've been cringing at it for the past 10 years."

The reality of course is that it's here to stay for now, even if the government is trying to shunt local authorities over to publishing data in CSV and other more open data-friendly formats.

"We can publish papers in a way that anyone can read for free without restriction, it should be open and eventually linked ... It's going to be a long uphill struggle. People are wasting massive amounts of effort by building spreadsheets in each university with the same sort of data and building custom tools," says Gutteridge.

"But you can do so much more in an open model, keeping in mind some things are still commercially sensitive and you still exercise common sense. So I don’t publish my home address or banking details in semantic form, for example ... The only real risk are the people who are used to a closed world and haven't worked out they're saying too much about themselves on Facebook."

Interestingly, all researchers at the University of Southampton (UoS) are "obliged" to make their data open. "They don't have the right to make it appear only enclosed ... We've shifted the tide, it's not perfect yet," Gutteridge explains.

He admits that the notion of a semantic web is "a challenge because you need to trust your sources".

But Gutteridge prefers to be knee-deep in code.

"Linked data is still semantic web – it's just ditching all the hard stuff. We're not abandoning it, but we're not making it the goal. Ultimately, we provide the tools. Let the politicians do the arguments."

He also concedes: "We will learn down the line that we've cocked up certain ways of doing things with linked data. It's a learning process. Things restructure all the bloody time. A renumbered building, for example, could break the linked data system. It's down to temporal, real-time data. The system's not perfect, but you've got to relax, these are the 404s of the semantic web. For it to work, it has got to work while being a bit broken." ®
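Gutteridge's "work while being a bit broken" point can be illustrated with a minimal Python sketch. It assumes a made-up lookup table of building identifiers; the `resolve_links` helper and the `example:building/...` URIs are hypothetical and not taken from Southampton's actual system.

```python
def resolve_links(links, known_resources):
    """Split links into resolved records and dangling URIs instead of raising."""
    resolved, dangling = {}, []
    for uri in links:
        record = known_resources.get(uri)
        if record is None:
            dangling.append(uri)  # the "404" case: note it and carry on
        else:
            resolved[uri] = record
    return resolved, dangling


# Hypothetical data: building B has been renumbered, so its old URI no longer resolves.
known_resources = {
    "example:building/A": {"name": "Building A"},
    "example:building/B-new": {"name": "Building B"},
}
links = ["example:building/A", "example:building/B"]

resolved, dangling = resolve_links(links, known_resources)
```

The caller still gets everything that could be resolved, plus a list of dead links to chase up later, rather than a failure on the first stale identifier.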
