Southampton Uni shows way to a truly open web

Making Berners-Lee's vision a reality

Waiting to exhale

In effect, Berners-Lee advocates want to link up data in an eloquent and constructive way on the web, using something called DBpedia as the central repository for information garnered online. And yes, chillingly for some Reg readers, that does involve using Wikipedia as a major data source. It doesn't just take a 'suck it and see' approach, however. Instead it grabs "structured information" using sophisticated queries against Jimbo Wales's database and, importantly, links other datasets to Wikipedia from around the web.
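For readers who would rather see that than take it on trust, here is a minimal Python sketch of the idea: facts stored as subject-predicate-object triples, queried by pattern, with one dataset pointing into another. It is not DBpedia's actual tooling, which is queried with SPARQL. The identifiers, the `example:` dataset and the helper names `match` and `linked_to` are all assumptions made up for illustration.

```python
# Hypothetical triples (subject, predicate, object).
# Identifiers are illustrative, not copied from DBpedia.
TRIPLES = [
    ("dbpedia:Southampton", "rdf:type", "dbo:City"),
    ("dbpedia:Southampton", "dbo:country", "dbpedia:United_Kingdom"),
    ("dbpedia:University_of_Southampton", "dbo:city", "dbpedia:Southampton"),
    # A made-up external dataset linking itself to a Wikipedia-derived resource.
    ("example:building/32", "ex:partOf", "dbpedia:University_of_Southampton"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def linked_to(triples, resource):
    """Return the subjects, from any dataset, that point at the given resource."""
    return sorted({t[0] for t in match(triples, o=resource)})
```

Asking `linked_to(TRIPLES, "dbpedia:University_of_Southampton")` finds the outside dataset's building record, which is the "links other datasets" part in miniature.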

In other words, it's a bit like telling a web surfer that the populist, if not wholly reliable, online encyclopedia shouldn't be the only source of information. Perhaps proponents of DBpedia would be happy if the database were eventually likened to a detail-obsessed librarian whose middle name is pedant. That certainly appears to be the goal, at least.

But unlocking information online remains a huge challenge, despite having a government in the UK that endorses the linked data desires of Berners-Lee, Southampton and others.

"Many research and evaluation projects in the few years of the Semantic Web technologies produced ontologies, and significant data stores, but the data, if available at all, is buried in a zip archive somewhere, rather than being accessible on the web as linked data," explained Berners-Lee back in 2006.

'PDF is an embarrassment to our species'

Currently, even when public information is made available online, problems remain with the kinds of data formats that are all too readily used by local government departments, academic institutions and other parts of the public sector.

"PDF is an embarrassment to our species," Gutteridge says of Adobe Systems' once-proprietary but now open standard for document exchange.

"PDF is a brilliant way to simulate A4 or portrait views. It was natural to create a new piece of technology to simulate the old ... But our screens are all A4 landscape yet there is this stupid insistence that the portrait way is still developed. It's a legacy thing and we haven't got around to getting rid of it yet. I've been cringing at it for the past 10 years."

The reality, of course, is that it's here to stay for now, even if the government is trying to shunt local authorities over to publishing data in CSV and other more open-data-friendly formats.

"We can publish papers in a way that anyone can read for free without restriction, it should be open and eventually linked ... It's going to be a long uphill struggle. People are wasting massive amounts of effort by building spreadsheets in each university with the same sort of data and building custom tools," says Gutteridge.

"But you can do so much more in an open model, keeping in mind some things are still commercially sensitive and you still exercise common sense. So I don’t publish my home address or banking details in semantic form, for example ... The only real risk are the people who are used to a closed world and haven't worked out they're saying too much about themselves on Facebook."

Interestingly, all researchers at the UoS are "obliged" to make their data open. "They don't have the right to make it appear only enclosed ... We've shifted the tide, it's not perfect yet," Gutteridge explains.

He admits that the notion of a semantic web is "a challenge because you need to trust your sources".

But Gutteridge prefers to be knee-deep in code.

"Linked data is still semantic web – it's just ditching all the hard stuff. We're not abandoning it, but we're not making it the goal. Ultimately, we provide the tools. Let the politicians do the arguments."

He also concedes: "We will learn down the line that we've cocked up certain ways of doing things with linked data. It's a learning process. Things restructure all the bloody time. A renumbered building, for example, could break the linked data system. It's down to temporal, real-time data. The system's not perfect, but you've got to relax, these are the 404s of the semantic web. For it to work, it has got to work while being a bit broken." ®
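What "working while being a bit broken" might look like in code is sketched below in Python. This is an illustration under stated assumptions, not Southampton's system: the `STORE` dictionary, its building identifiers and the helper `follow_links` are all hypothetical. The point is only that a dangling link gets reported and skipped rather than bringing the whole lookup down.

```python
# Hypothetical data store keyed by identifier. Building 59 has been
# renumbered, so its old identifier no longer resolves to anything.
STORE = {
    "example:building/32": {"label": "Building 32"},
    "example:building/60": {"label": "Building 60 (formerly 59)"},
}


def follow_links(store, uris):
    """Resolve each link, collecting broken ones instead of raising."""
    found, broken = {}, []
    for uri in uris:
        record = store.get(uri)  # returns None for a dangling link
        if record is None:
            broken.append(uri)   # the "404" case: note it and carry on
        else:
            found[uri] = record
    return found, broken
```

Calling `follow_links(STORE, ["example:building/32", "example:building/59"])` returns the one record that still resolves plus a list holding the stale identifier, which a publisher could then chase up at leisure.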
