Multivalued datatypes considered harmful

How dangerous can a data type be?

Increasingly, developers are required to write applications that interact with database engines – typically Oracle, SQL Server, DB2, MySQL or Access. In many ways the database engine is pretty much immaterial; no matter what the flavour, it’s still simply a matter of tables, columns, rows and a variety of data types: text, memo, BLOB, numeric, whatever. However, if you work with Access, a completely new data type is on the horizon for 2007 – multi-valued. Unfortunately, this isn’t just-another-data-type; this is a whole different ball game and a dangerous one – more like rollerball than baseball.

As the name suggests, a multi-valued field is one in which you can place more than one value. So, imagine that you design a table to store information about, say, customers.

CUSTOMER

CustID  FName  LName   Hobbies
1       Fred   Smith   Fishing, Rollerball, Hockey
2       Sally  Jones   Sailing
3       Brian  Wilson  Gliding, Sailing, Singing, Hockey

The Hobbies column is a multi-valued field. In some ways this is very neat because we have created a many-to-many relationship (many different customers can have many different hobbies) using a single table. The alternative, and traditional, way is to use three tables.

CUSTOMER

CustID  FName  LName
1       Fred   Smith
2       Sally  Jones
3       Brian  Wilson

HOBBY

HobbyID  Hobbies
1        Fishing
2        Rollerball
3        Hockey
4        Sailing
5        Gliding
6        Singing

CUSTOMER/HOBBY

CustID  HobbyID
1       1
1       2
1       3
2       4
3       5
3       4
3       6
3       3
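For anyone who wants to try the three-table design, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for Access purely for convenience, and the lower-case table and column names (customer, hobby, customer_hobby and so on) are my own assumptions, not anything Access prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys on request

conn.executescript("""
CREATE TABLE customer (
    cust_id INTEGER PRIMARY KEY,
    fname   TEXT NOT NULL,
    lname   TEXT NOT NULL
);
CREATE TABLE hobby (
    hobby_id INTEGER PRIMARY KEY,
    hobby    TEXT NOT NULL UNIQUE
);
-- Junction table: one row per (customer, hobby) pair.
CREATE TABLE customer_hobby (
    cust_id  INTEGER NOT NULL REFERENCES customer(cust_id),
    hobby_id INTEGER NOT NULL REFERENCES hobby(hobby_id),
    PRIMARY KEY (cust_id, hobby_id)
);
""")

conn.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                 [(1, "Fred", "Smith"), (2, "Sally", "Jones"), (3, "Brian", "Wilson")])
conn.executemany("INSERT INTO hobby VALUES (?, ?)",
                 [(1, "Fishing"), (2, "Rollerball"), (3, "Hockey"),
                  (4, "Sailing"), (5, "Gliding"), (6, "Singing")])
conn.executemany("INSERT INTO customer_hobby VALUES (?, ?)",
                 [(1, 1), (1, 2), (1, 3), (2, 4), (3, 5), (3, 4), (3, 6), (3, 3)])

# Reassemble "who does what" with two joins; every value returned is atomic.
rows = conn.execute("""
    SELECT c.fname, h.hobby
    FROM customer c
    JOIN customer_hobby ch ON ch.cust_id = c.cust_id
    JOIN hobby h           ON h.hobby_id = ch.hobby_id
    ORDER BY c.cust_id, h.hobby_id
""").fetchall()
```

The join produces one row per customer/hobby pair, which is exactly the information the single multi-valued column was holding.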

Clearly, the solution using three tables is more complex and less intuitive; so what is wrong with the multi-valued data type solution? Well, in his initial set of rules defining relational databases, Ted Codd (the originator of the relational model) forbad their use.

Rule 2, the guaranteed access rule.

Each and every datum (atomic value) in a relational data base is guaranteed to be logically accessible by resorting to a combination of table name, primary key value and column name.

If we use the table name (Customer), primary key value (1) and column name (Hobbies), we don’t get a single atomic value (such as ‘Fishing’); we get multiple pieces of data (Fishing, Rollerball, Hockey).
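A rough illustration of the point, again using sqlite3 as a stand-in. The assumption here is that a plain TEXT column holding a comma-separated list behaves, for the purposes of the argument, like a multi-valued field; the real Access data type is stored differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "cust_id INTEGER PRIMARY KEY, fname TEXT, lname TEXT, hobbies TEXT)"
)
conn.execute(
    "INSERT INTO customer VALUES (1, 'Fred', 'Smith', 'Fishing, Rollerball, Hockey')"
)

# Table name + primary key value + column name addresses exactly one cell...
(cell,) = conn.execute(
    "SELECT hobbies FROM customer WHERE cust_id = 1"
).fetchone()

# ...but the cell holds a list, so the application has to pick it apart itself.
values = [v.strip() for v in cell.split(",")]
```

The database has done its part by handing back one cell; getting from that cell to 'Fishing' is left to application code.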

Now this is a killer argument if you are a database freak like me (“If Ted Codd forbad it, I want no further truck with your multi-valued data types.”) but I quite understand that, if you are an application developer, the finer points of relational database theory often sound like just so much academic nonsense. If a new feature makes life easier, who cares if it happens to break some arbitrary rule written over 20 years ago?

Fair enough. So let’s look at an intensely practical reason why multi-valued fields are so bad. We query databases using SQL. The design of SQL is based entirely on the assumption that each column contains atomic values. If we run a normal SQL query against our single table solution:

SELECT FName FROM CUSTOMER
WHERE Hobbies = 'Rollerball'

it will return zero rows, despite the fact that one of our customers plays rollerball, because there is no row whose Hobbies field contains just “Rollerball”.
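The zero-row result can be reproduced with a small sketch. As before, a comma-separated TEXT column in SQLite is only my stand-in for the multi-valued field (Access itself may behave differently), and the table names customer_mv, customer, hobby and customer_hobby are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Single-table design: hobbies held as one delimited value per customer.
CREATE TABLE customer_mv (cust_id INTEGER PRIMARY KEY, fname TEXT, hobbies TEXT);
INSERT INTO customer_mv VALUES
    (1, 'Fred',  'Fishing, Rollerball, Hockey'),
    (2, 'Sally', 'Sailing'),
    (3, 'Brian', 'Gliding, Sailing, Singing, Hockey');

-- Three-table design holding the same facts.
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, fname TEXT);
CREATE TABLE hobby (hobby_id INTEGER PRIMARY KEY, hobby TEXT);
CREATE TABLE customer_hobby (cust_id INTEGER, hobby_id INTEGER);
INSERT INTO customer VALUES (1, 'Fred'), (2, 'Sally'), (3, 'Brian');
INSERT INTO hobby VALUES (1, 'Fishing'), (2, 'Rollerball'), (3, 'Hockey'),
                         (4, 'Sailing'), (5, 'Gliding'), (6, 'Singing');
INSERT INTO customer_hobby VALUES (1,1),(1,2),(1,3),(2,4),(3,5),(3,4),(3,6),(3,3);
""")

# Plain equality against the list column: no cell equals 'Rollerball' exactly.
single = conn.execute(
    "SELECT fname FROM customer_mv WHERE hobbies = 'Rollerball'"
).fetchall()

# The same question against atomic values finds the rollerball player.
joined = conn.execute("""
    SELECT c.fname
    FROM customer c
    JOIN customer_hobby ch ON ch.cust_id = c.cust_id
    JOIN hobby h           ON h.hobby_id = ch.hobby_id
    WHERE h.hobby = 'Rollerball'
""").fetchall()
```

Here single comes back empty while joined contains Fred, even though both structures record the same facts.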

If you want a challenge, try to construct the SQL necessary to find the names of the customers who both glide and sail. It is, of course, possible, but the solution is more complex than extracting the same information from the three-table structure.
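For the impatient, here is one possible attempt at both versions, offered as a sketch rather than the definitive answer. It assumes the multi-valued field can be imitated by a TEXT column with ", " as the delimiter, and all table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_mv (cust_id INTEGER PRIMARY KEY, fname TEXT, hobbies TEXT);
INSERT INTO customer_mv VALUES
    (1, 'Fred',  'Fishing, Rollerball, Hockey'),
    (2, 'Sally', 'Sailing'),
    (3, 'Brian', 'Gliding, Sailing, Singing, Hockey');

CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, fname TEXT);
CREATE TABLE hobby (hobby_id INTEGER PRIMARY KEY, hobby TEXT);
CREATE TABLE customer_hobby (cust_id INTEGER, hobby_id INTEGER);
INSERT INTO customer VALUES (1, 'Fred'), (2, 'Sally'), (3, 'Brian');
INSERT INTO hobby VALUES (1, 'Fishing'), (2, 'Rollerball'), (3, 'Hockey'),
                         (4, 'Sailing'), (5, 'Gliding'), (6, 'Singing');
INSERT INTO customer_hobby VALUES (1,1),(1,2),(1,3),(2,4),(3,5),(3,4),(3,6),(3,3);
""")

# List column: pad the value with delimiters, then pattern-match each hobby.
# This relies on every row using exactly ', ' as the separator.
like_rows = conn.execute("""
    SELECT fname
    FROM customer_mv
    WHERE ', ' || hobbies || ', ' LIKE '%, Gliding, %'
      AND ', ' || hobbies || ', ' LIKE '%, Sailing, %'
""").fetchall()

# Three tables: keep the two hobbies of interest, then require both per customer.
join_rows = conn.execute("""
    SELECT c.fname
    FROM customer c
    JOIN customer_hobby ch ON ch.cust_id = c.cust_id
    JOIN hobby h           ON h.hobby_id = ch.hobby_id
    WHERE h.hobby IN ('Gliding', 'Sailing')
    GROUP BY c.cust_id, c.fname
    HAVING COUNT(DISTINCT h.hobby) = 2
""").fetchall()
```

Both queries pick out Brian on this data, but the pattern-matching version is string surgery: it breaks as soon as one row is delimited differently, whereas the join works on values the database actually understands.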

And if you are not convinced that we are plunging into deep water here, imagine that I store foreign key values (in the example shown, referencing a PRODUCT table) in a multi-valued field:

ORDER

OrderID  ODate   Products
1        1/1/07  1,5,5,5,5,5,67,434,434,5654
2        1/1/07  45,67,454,454,454,65556
3        2/1/07  2,454,5677
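One way to see why this is deep water, sketched under assumptions: in SQLite at least, a foreign key constraint can be declared on a column of atomic values but not on the individual entries of a delimited list. The tables orders_mv and order_item, and the bogus product number 9999, are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys on request

conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY);
INSERT INTO product VALUES (2), (454), (5677);

-- Stand-in for the multi-valued design: the key list is just opaque text.
CREATE TABLE orders_mv (order_id INTEGER PRIMARY KEY, odate TEXT, products TEXT);

-- Hypothetical normalised alternative: one row per order/product pair.
CREATE TABLE order_item (
    order_id   INTEGER NOT NULL,
    product_id INTEGER NOT NULL REFERENCES product(product_id)
);
""")

# 9999 is not a product, yet the list column accepts it without complaint.
conn.execute("INSERT INTO orders_mv VALUES (3, '2/1/07', '2,454,9999')")

# The same bad key in the normalised table is refused by the engine.
try:
    conn.execute("INSERT INTO order_item VALUES (3, 9999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

In this sketch the engine polices the atomic column for us, while the list column will swallow any number we care to type into it.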
