Feeds

On data models, data types and dangerous liaisons

Context, dear boy, context

Combat fraud and increase customer satisfaction

Comment A data model is a methodology for storing, handling and manipulating data. There are lots of them around. One of the most commonly employed at present is the relational model. Brainchild of Edgar Codd, it rapidly came into favour after he published his seminal paper in 1970. Many of the popular database engines today (for example DB2, Oracle and SQL Server) are based on the relational model.

Data models are often relatively complex entities, made up of a set of interwoven checks, constraints and rules. Any one rule, on its own, may make little sense; but the entire set, working together, should produce a useful and internally consistent system.

The complexity of these models means that it often takes some time to get to grips with them. For reasons that presumably only a psychologist could explain, when some humans learn a complex system they can develop an attachment to it. A fanatical attachment. The data model that the poor unfortunate victim has learnt becomes the “one true way”. There are no longer several data models from which to choose, there is one correct data model and all of the others are, quite clearly, evil. By this point, we aren’t simply in “blue suede shoes” territory; we have entered “work of the devil” land.

This is unfortunate because, in truth, all data models have their pros and cons. This is an inevitable side effect of the interlacing set of rules that make them up. Take the relational model. It has a number of rules, systems and features (for example, referential integrity) that allow it to provide relatively high data integrity. The downside of these rules is that, every time data is added to, or modified in, the database, the rules must be deployed to check the integrity of the data. This slows the database down. So the pro also has a con.

If we move to the multi-dimensional model of data, we can build structures that are tens of thousands of times faster to query than would be possible using comparable relational model. We gain speed - but we lose many of the data integrity rules.

Any argument about which of these data models is the ‘best’ is ultimately meaningless, unless we know the context in which it will be deployed. If query processing speed is of enormous importance to the prospective end-user of the system then that is clearly going to influence which of these two data models is ‘best’ in this particular context.

This context is equally important within the structure of any data model; take data types for example. There is a huge range of possible data types out there in the big wide world of databases – integer, binary, text, BLOB, XML, multi-valued and so on.

Asking for my opinion on, for example, the multi-valued data type (MVDT) is like asking whether I approve of dynamite or not. I don’t have a view either way until I know the context.

Mr Dy-na-mi-tee

"So, Mark, you think it is appropriate to carry dynamite in a crowded street do you?" No, of course not. The urban environment provides a very poor context for explosives. What about using it in controlled conditions to remove tree-stumps from the ground? Fine. (If you are an environmentalist who loves earwigs and feels sorry for the trauma inflicted upon them by the use of dynamite in this way, I apologise in advance.)

In other words, neither dynamite nor MVDTs inherently display any “good” or “bad” qualities whatsoever. Such attributes depend entirely upon context.

Now let’s look at MVDTs in the context of Access. Microsoft has recently announced that its PC-based database engine is to acquire this data type. This is a whole new data type for Access, one which it has so far managed to do without, throughout its 14 year life span. Access is a relational database engine. (Please, let’s not get into a discussion about whether it is truly relational or merely close to relational. It is about as relational as any other engine that purports to be relational today.)

One of the rules of the relational model is that all of the data held therein is atomic. In fact, this is one of the core rules of the relational model; by which I mean that it has a huge effect on many of the other components of the model. For example SQL, the language commonly used to manipulate relational databases, is completely based around the idea that the data it manipulates will be atomic. Unless we extend the language, it does not have the tools necessary to manipulate non-atomic data; such as that found in MVDTs.

You are entirely free to think that this is a good model for data and you are equally free to think that this is a bad model for data. Your opinion (and/or mine) does not change, by one iota, the fact that the core internal rules that govern relational databases happen to forbid multiple values in a single field.

The dialect of SQL that Access uses has no specific operators to handle MVDTs. So the introduction of MVDTs into Access will, I believe, make life more difficult for users of Access. I believe that Access developers should eschew their use, as I said here [and, in fact, we were a little surprised when some readers saw this article as a generic put down of MVDTs in any context – The Eds].

Now ask me what I think of MVDTs in the context of non-relational databases such as the PICK-related products (the brain-children of Dick Pick; there is some context here). These database products, such as IBM’s U2 (and even Intersystems Cache, which effectively divorces the logical view from the underlying physical implementation and thus supports the PICK MV model, the relational model and several others at the same time), are designed to use MVDTs and have the necessary operators to manipulate the data in them - so, no problem. MVDTs are perfectly valid in this context.

The relational model is just another data model; but the complex rules that comprise it make it a poor environment for MVDTs. Each on its own is fine; but taken together, the relational model and the MVDT form a dangerous liaison.®

High performance access to file storage

More from The Register

next story
Android engineer: We DIDN'T copy Apple OR follow Samsung's orders
Veep testifies for Samsung during Apple patent trial
This time it's 'Personal': new Office 365 sub covers just two devices
Redmond also brings Office into Google's back yard
Batten down the hatches, Ubuntu 14.04 LTS due in TWO DAYS
Admins dab straining server brows in advance of Trusty Tahr's long-term support landing
Microsoft lobs pre-release Windows Phone 8.1 at devs who dare
App makers can load it before anyone else, but if they do they're stuck with it
Half of Twitter's 'active users' are SILENT STALKERS
Nearly 50% have NEVER tweeted a word
Windows XP still has 27 per cent market share on its deathbed
Windows 7 making some gains on XP Death Day
Internet-of-stuff startup dumps NoSQL for ... SQL?
NoSQL taste great at first but lacks proper nutrients, says startup cloud whiz
Windows 8.1, which you probably haven't upgraded to yet, ALREADY OBSOLETE
Pre-Update versions of new Windows version will no longer support patches
Microsoft TIER SMEAR changes app prices whether devs ask or not
Some go up, some go down, Redmond goes silent
Red Hat to ship RHEL 7 release candidate with a taste of container tech
Grab 'near-final' version of next Enterprise Linux next week
prev story

Whitepapers

Designing a defence for mobile apps
In this whitepaper learn the various considerations for defending mobile applications; from the mobile application architecture itself to the myriad testing technologies needed to properly assess mobile applications risk.
3 Big data security analytics techniques
Applying these Big Data security analytics techniques can help you make your business safer by detecting attacks early, before significant damage is done.
Five 3D headsets to be won!
We were so impressed by the Durovis Dive headset we’ve asked the company to give some away to Reg readers.
The benefits of software based PBX
Why you should break free from your proprietary PBX and how to leverage your existing server hardware.
Securing web applications made simple and scalable
In this whitepaper learn how automated security testing can provide a simple and scalable way to protect your web applications.