
Database in depth

A plea for a return to Relational basics


Book review

Readers have probably noticed that I was a DBA (Database Administrator) in a former life and I'm an enthusiast for the Relational Model.

However, I must confess that I've always taken my Codd re-digested by Chris Date (I have two editions of his work An Introduction to Database Systems on my shelves, both of which I actually paid for).

Now there is a new book from Date, subtitled Relational Theory for Practitioners, which complements rather than replaces his other works.

Date's premise is that "theory is practical", and I'm with Date here - if theory doesn't deliver practical benefits, you need more useful theories. But I'm also reminded of a VP in a bank who once said to me "you may be right, David, but we must be practical" about some database issue (this was the same bank that thought that buying programmers a book on the theory of testing would be a waste of money).

Anyway, Date's book explores relational theory in depth. This is important, as so many RDBMS (Relational Database Management System) vendors have lost contact with the underlying theory (the book documents several examples of idiocies culled from database vendors' marketing material).

If you want to argue that multi-value "post relational" databases (they're actually pre-relational, mostly derived from Pick) supersede RDBMS then you really should read and understand this book first - much of what you hate about RDBMS may be down to the (usually poor) implementation rather than to relational theory itself.

For instance, Date maintains that storing a multi-valued set of items in an RDBMS column is perfectly OK, and that a proper relational query language should be able to process both the RDBMS as a whole and this embedded set – we have, in effect, a relational database nested in a relational database.

In other words, a set, at some level of abstraction, is atomic, just as we consider a character string to be atomic, even though it can be decomposed into characters - and, of course, an arbitrary list of unrelated items stored in a convenient bucket is still an un-normalised maintenance disaster waiting to happen.
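To make the "set as an atomic value" idea concrete, here is a minimal sketch in Python rather than in Tutorial D or anything taken from the book. The supplier and part names are hypothetical, and plain frozensets stand in for relations; the point is only that the outer relation holds each nested set as a single value, yet a query can still reach inside it.

```python
from typing import FrozenSet, NamedTuple


class Supplier(NamedTuple):
    sno: str
    parts: FrozenSet[str]  # set-valued attribute, held as one value


# A relation is a set of tuples, so duplicate tuples cannot occur.
# All names and values here are hypothetical illustration data.
suppliers = frozenset({
    Supplier("S1", frozenset({"P1", "P2"})),
    Supplier("S2", frozenset({"P2"})),
    Supplier("S3", frozenset()),
})


def ungroup(relation):
    """Flatten the nested sets into (sno, pno) pairs."""
    return frozenset((s.sno, p) for s in relation for p in s.parts)


def suppliers_of(relation, pno):
    """Query inside the nested set from the outer relation."""
    return frozenset(s.sno for s in relation if pno in s.parts)


flat = ungroup(suppliers)
p2_suppliers = suppliers_of(suppliers, "P2")
```

Note that the nested value here is a typed set of part numbers, not an arbitrary bucket of unrelated items; that distinction is the one the paragraph above is drawing.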

Note that I said "proper relational query language" in that last paragraph. Date is scathing about the compromises in SQL and many RDBMS implementations (as a result, he introduces Tutorial D, a query language that supports relational theory more completely and directly).

And he objects to these compromises not just for theoretical reasons, but because, for example, allowing "duplicate tuples" limits the practical capability of the optimiser to refactor and optimise queries automatically; and the introduction of "nulls" (occasionally) results in practical query results that fail the "common sense" test.
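Both complaints are easy to reproduce. The following is a rough sketch using Python's standard sqlite3 module, not an example from Date's book; the `part` table and its rows are made up for illustration. It shows a condition that looks like a tautology silently dropping a row with a null, and a duplicated row being counted twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (pno TEXT, weight INTEGER)")
conn.executemany(
    "INSERT INTO part VALUES (?, ?)",
    # Hypothetical data: P2 has an unknown weight, P3 is stored twice.
    [("P1", 12), ("P2", None), ("P3", 17), ("P3", 17)],
)

total = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]

# "weight = 12 OR weight <> 12" reads like a tautology, but for the row with
# a NULL weight the condition evaluates to unknown, so the row is dropped.
matched = conn.execute(
    "SELECT COUNT(*) FROM part WHERE weight = 12 OR weight <> 12"
).fetchone()[0]

# Duplicates: the same fact is counted twice unless DISTINCT is requested.
bag_count = conn.execute(
    "SELECT COUNT(*) FROM part WHERE pno = 'P3'"
).fetchone()[0]
set_count = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT pno, weight FROM part WHERE pno = 'P3')"
).fetchone()[0]

conn.close()
```

Whether you regard the missing row as a bug or a feature is, of course, exactly the argument Date and Codd were having.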

And, did you think that the difference between a base table and a view was that the base table physically exists on disk and a view doesn't; and that joins were inevitably slow?

Date claims that relational theory makes no distinction between "views" and "tables" (neither need exist on disk) and that joins have no associated property of "speed", although a poor implementation (possibly involving the necessity for putting base tables physically on disk) might be slow. None of this will be news to that "post-relational" RDBMS vendor Caché, of course. Caché data is all stored in sparse arrays and its "relational view" (whether or not it meets Date's stringent criteria) is, at least, a purely logical view.
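As a small illustration of the logical point (and only that; it says nothing about how any particular product stores things), here is a sketch with Python's standard sqlite3 module and an in-memory database. The `supplier` table, the `london_supplier` view and the `rows` helper are hypothetical names. Nothing here touches disk, and the query code cannot tell the base table from the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory: nothing is written to disk
conn.execute("CREATE TABLE supplier (sno TEXT PRIMARY KEY, city TEXT)")
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?)",
    [("S1", "London"), ("S2", "Paris"), ("S3", "London")],  # made-up rows
)
conn.execute(
    "CREATE VIEW london_supplier AS "
    "SELECT sno, city FROM supplier WHERE city = 'London'"
)


def rows(name):
    """Hypothetical helper: read any named relation, base table or view.

    The name is interpolated directly, so pass only trusted, hard-coded names.
    """
    return sorted(conn.execute(f"SELECT sno FROM {name}").fetchall())


base_rows = rows("supplier")
view_rows = rows("london_supplier")
```

The same `SELECT` works against either name, which is the user-level indistinguishability Date is talking about; how fast either one runs is a property of the implementation underneath.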

Date's book is not thick and it is well and entertainingly written, but it is not easy going for all that. Readers will need some familiarity with the mathematical concepts of set theory (or, at least, they'd better not find them frightening) and they'll need to be able to think clearly and logically. They'll also need to put aside their preconceptions and think critically - Codd believed that "nulls" have a place in RDBMS (as do I, for what that's worth); Date argues cogently against this; probably, neither view is "right" in absolute terms.

This all makes the effort you'll put into reading this book worthwhile, but it isn't anything like a Wikipedia summary of RDBMS. I'm not sure I understand its implications fully on a first read but, as Mark Whitehorn put it when we discussed this book, "at least you understand that you ought to understand them".

Date provides exercises (with answers on the web) which should make it much easier to "self-pace" your study and, unlike some of Date's other books, this really isn't just an academic textbook in my view (perhaps this is why the index isn't perfect – I found at least one reference a page out).

Finally, however, a warning. The back cover of this book promises that reading and understanding it will "set you apart and well above the competition when it comes to getting work done with a relational database". I'm afraid I seriously doubt that, for all sorts of dysfunctional reasons, although that wouldn't stop me reading it.

It will make you better informed and wiser, and I'll listen more attentively when you tell me that multi-valued databases are the way to go if you've read and understood it, but true relational support is so rare these days that knowledge of relational theory may actually be a practical hindrance in some companies.

While the superior man is worrying about relational issues, the hack will be cobbling together something which just about works, mostly and for now. Which may be all that his employer wants – until things go wrong and they wish they'd used someone who actually understood database theory (or at least someone who knew what they didn't know). As usual, "doing it right" is easiest in a mature organisation with enlightened and informed managers – and how common are these today?

Database in Depth


Verdict: If you are a DBA or claim to understand databases, then I think you should read and digest this book. You may not agree with all the points Date makes, but you should be able to follow his arguments. It is also pretty readable on the surface, and provides exercises (with answers), which will help you navigate its depths.

Author: C J Date

Publisher: O'Reilly

ISBN: 0-596-10012-4

Media: Book

List Price: £20.95

