Blog: The meaning of the meaning of meaning

A somewhat less than meaningful discussion that is full of meanings

My recent piece on search engines provoked this email comment from Tom Welsh (Reg Dev contributor and Senior Consultant with Cutter Consortium):

"Most people are quite unaware of the yawning gap between data and "knowledge", which is what semantics is all about. As for the term "unstructured", I concluded long ago that it is thoroughly subjective, and means whatever the given speaker finds unclear."

Tom's quite right and it set me thinking. I remember when Microsoft's XML guru told me that XML has all the semantics anyone needs; of course, by itself it has no semantics at all (try reading a piece of valid well-formed XML in Mandarin and writing a program to process it correctly) which is why we need ontologies and the semantic web and so on. Our XML guru admitted this point, once he'd thought about it a bit.
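To make the Mandarin point concrete, here is a minimal Python sketch. The invoice document and its element names are invented for illustration and are not taken from any real schema. The parser accepts the document happily, but nothing in the XML itself says which element is the amount owed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the Mandarin element names are invented for this sketch.
DOC = "<发票><客户>王小明</客户><金额>100</金额></发票>"

# Parsing succeeds because the document is well-formed.
root = ET.fromstring(DOC)

# All the parser can give us is structure: element names and text.
tags = [child.tag for child in root]
print(root.tag, tags)

# The meaning has to come from outside the document, for example from an
# agreed vocabulary. This mapping is an assumption supplied by a human reader.
VOCABULARY = {"发票": "invoice", "客户": "customer", "金额": "amount"}
amount = next(int(c.text) for c in root if VOCABULARY.get(c.tag) == "amount")
print(amount)
```

Without the `VOCABULARY` mapping, or something like it, the program has no basis for picking one element over another.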

And that's the issue. Questions of semantics are "obvious" until we think about them.

We recognise an XML invoice - in English - but if we try to process it without knowing the legal significance of some of the terms we may come a cropper. Sometimes our assumptions will be right, however, and we'll get lazy - but assumptions about semantics are still assumptions.

I know that a customer is someone we've done business with, who has a validated credit rating; but you may think of a customer as anyone who has expressed an interest in our product. Mix well-formed XML documents corresponding to both types of customer in one database and I think we may have business-level issues to resolve sooner or later - but the associated automated processing will probably work just fine, up to the point of delivering the wrong answers.
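A rough sketch of how that plays out, in Python. The two feeds, the company names and the `count_customers` helper are all hypothetical, made up for this illustration. Both documents are well-formed and use the same `customer` element, so the code runs without error. Whether its answer is right depends entirely on which meaning of "customer" the person asking had in mind.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed 1: "customer" means a credit-checked party we have traded with.
SALES = (
    "<customers>"
    "<customer><name>Acme</name></customer>"
    "<customer><name>Bolt</name></customer>"
    "</customers>"
)

# Hypothetical feed 2: "customer" means anyone who has expressed an interest.
MARKETING = (
    "<customers>"
    "<customer><name>Acme</name></customer>"
    "<customer><name>Crane</name></customer>"
    "<customer><name>Dune</name></customer>"
    "</customers>"
)


def customer_names(xml_text):
    """Return the set of names found in <customer> elements."""
    return {c.findtext("name") for c in ET.fromstring(xml_text).iter("customer")}


def count_customers(*documents):
    """Count distinct names across all documents, trusting the tag name alone."""
    names = set()
    for doc in documents:
        names |= customer_names(doc)
    return len(names)


total = count_customers(SALES, MARKETING)    # every distinct "customer" in the merged data
credit_checked = len(customer_names(SALES))  # customers in the stricter sense only
print(total, credit_checked)
```

The merged count and the credit-checked count differ, yet nothing in the processing fails or warns. The discrepancy only surfaces when someone uses the larger figure where the smaller one was meant.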

I did think of taking a three-way view in my piece - Data is just raw data; Information (data plus metadata, schema, semantics etc) is data that we understand; Knowledge is information that we can, and do, actually use for something useful. Now, does semantics come in with Information or Knowledge? It's a question of semantics (and, yes, that's a different link and includes a slightly different meaning).

But it gets messier. Even raw data can be knowledge, perhaps, if we use it for something; but it's high-risk knowledge, because if we are using data blindly there's a chance that we'll use it inappropriately...

Besides, is such raw data really raw data and not just low-value information? If we know that when a figure 6 appears on the dial we must splifflicate the bifurcator (even if we know nothing more about what this means), that "6" data item has become both information (we know it relates to bifurcators) and knowledge (we use it). Which gets us into knowledge quality metrics and the like...

Just don't go there....

But someone will have to...®
