Database in depth

A plea for a return to Relational basics

Book review

Readers have probably noticed that I was a DBA (Database Administrator) in a former life and I'm an enthusiast for the Relational Model.

However, I must confess that I've always taken my Codd re-digested by Chris Date (I have two editions of his work An Introduction to Database Systems on my shelves, both of which I actually paid for).

Now there is a new book from Date, subtitled Relational Theory for Practitioners, which complements rather than replaces his other works.

Date's premise is that "theory is practical", and I'm with Date here - if theory doesn't deliver practical benefits, you need more useful theories. But I'm also reminded of a VP in a bank who once said to me "you may be right, David, but we must be practical" about some database issue (this was the same bank that thought that buying programmers a book on the theory of testing would be a waste of money).

Anyway, Date's book explores relational theory in depth. This is important, as so many RDBMS (Relational Database Management System) vendors have lost contact with the underlying theory (the book documents several examples of idiocies culled from database vendors' marketing material).

If you want to argue that multi-value "post relational" databases (they're actually pre-relational, mostly derived from Pick) supersede RDBMS then you really should read and understand this book first - much of what you hate about RDBMS may be down to the (usually poor) implementation rather than to relational theory itself.

For instance, Date maintains that storing a multi-valued set of items in an RDBMS column is perfectly OK, and that a proper relational query language should be able to process both the RDBMS as a whole and this embedded set – we have, in effect, a relational database nested in a relational database.

In other words, a set, at some level of abstraction, is atomic, just as we consider a character string to be atomic, even though it can be decomposed into characters - and, of course, an arbitrary list of unrelated items stored in a convenient bucket is still an un-normalised maintenance disaster waiting to happen.
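To make the "set as an atomic value" idea concrete, here is a minimal sketch in Python. It is my own illustration, not code from Date's book (which uses Tutorial D), and the supplier and part names are invented. Each row holds a set-valued attribute that is treated as a single value, yet a query can still reach inside it or flatten it back out.

```python
from typing import FrozenSet, NamedTuple


class Supplier(NamedTuple):
    """One tuple of a hypothetical 'suppliers' relation."""
    sno: str
    parts: FrozenSet[str]  # a set-valued attribute, held as ONE value


# The relation itself is a set of tuples, so duplicates cannot occur.
suppliers = {
    Supplier("S1", frozenset({"P1", "P2"})),
    Supplier("S2", frozenset({"P2"})),
}

# A query that looks inside the nested set.
who_supplies_p2 = {s.sno for s in suppliers if "P2" in s.parts}

# "Ungrouping": flatten the nested sets into an ordinary two-column relation.
flat = {(s.sno, p) for s in suppliers for p in s.parts}

print(sorted(who_supplies_p2))
print(sorted(flat))
```

The point of the sketch is only that the nested set has a declared type and a meaning; it is not an arbitrary bucket of unrelated items.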

Note that I said "proper relational query language" in that last paragraph. Date is scathing about the compromises in SQL and many RDBMS implementations (as a result, he introduces Tutorial D, a query language that supports relational theory more completely and directly).

And he objects to these compromises not just for theoretical reasons, but because, for example, allowing "duplicate tuples" limits the practical capability of the optimiser to refactor and optimise queries automatically; and the introduction of "nulls" (occasionally) results in practical query results that fail the "common sense" test.
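Both complaints are easy to reproduce. The following is a small sketch using Python's built-in sqlite3 module; the tables and data are invented for illustration and are not taken from the book. The first part shows a null making "in London" plus "not in London" add up to fewer rows than the table holds; the second shows that, once duplicates are allowed, a query with and without DISTINCT are no longer interchangeable, which is the sort of rewrite an optimiser would otherwise be free to make.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (pno TEXT, city TEXT);
    INSERT INTO part VALUES ('P1', 'London'), ('P2', NULL), ('P3', 'Paris');

    CREATE TABLE colour (name TEXT);
    INSERT INTO colour VALUES ('red'), ('red');
""")


def count(sql):
    """Run a COUNT(*) query and return the single number it yields."""
    return conn.execute(sql).fetchone()[0]


# Nulls: the two "opposite" conditions do not cover the whole table,
# because comparing NULL with anything is neither true nor false.
total = count("SELECT COUNT(*) FROM part")
in_london = count("SELECT COUNT(*) FROM part WHERE city = 'London'")
not_in_london = count("SELECT COUNT(*) FROM part WHERE city <> 'London'")

# Duplicates: the same logical question gives different row counts.
as_bag = conn.execute("SELECT name FROM colour").fetchall()
as_set = conn.execute("SELECT DISTINCT name FROM colour").fetchall()

print(total, in_london, not_in_london)
print(len(as_bag), len(as_set))
```

SQLite is just a convenient stand-in here; as far as I know the same behaviour follows from standard SQL's three-valued logic and bag semantics.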

And, did you think that the difference between a base table and a view was that the base table physically exists on disk and a view doesn't; and that joins were inevitably slow?

Date claims that relational theory makes no distinction between "views" and "tables" (neither need exist on disk) and that joins have no associated property of "speed", although a poor implementation (possibly involving the necessity for putting base tables physically on disk) might be slow. None of this will be news to that "post-relational" RDBMS vendor Caché, of course. Caché data is all stored in sparse arrays and its "relational view" (whether or not it meets Date's stringent criteria) is, at least, a purely logical view.
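As a rough illustration of the logical point, the sketch below uses Python's built-in sqlite3 module with an in-memory database, so neither the base table nor the view ever exists on disk, and both are queried in exactly the same way. The table, view and column names are invented, and SQLite makes no claim to be "truly relational" in Date's sense; this only shows that the table/view distinction need not be a physical one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # nothing here is written to disk
conn.executescript("""
    CREATE TABLE emp (ename TEXT, dept TEXT);
    INSERT INTO emp VALUES ('Ann', 'IT'), ('Bob', 'HR');
    CREATE VIEW it_emp AS SELECT ename, dept FROM emp WHERE dept = 'IT';
""")


def names(relvar):
    """Return the employee names in a table or view, sorted.

    relvar is a hard-coded, trusted name in this sketch; SQL parameters
    cannot be used for table names.
    """
    return sorted(row[0] for row in conn.execute(f"SELECT ename FROM {relvar}"))


from_table = names("emp")    # a base table
from_view = names("it_emp")  # a view, queried with the identical statement

# The catalogue knows which is which, but the query did not need to.
kinds = dict(conn.execute("SELECT name, type FROM sqlite_master"))

print(from_table, from_view, kinds)
```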

Date's book is not thick and it is well and entertainingly written, but it is not easy going for all that. Readers will need some familiarity with the mathematical concepts of set theory (or, at least, they'd better not find them frightening) and they'll need to be able to think clearly and logically. They'll also need to put aside their preconceptions and think critically - Codd believed that "nulls" have a place in RDBMS (as do I, for what that's worth); Date argues cogently against this; probably, neither view is "right" in absolute terms.

This all makes the effort you'll put into reading this book worthwhile, but it isn't anything like a Wikipedia summary of RDBMS. I'm not sure I understand its implications fully on a first read but, as Mark Whitehorn put it when we discussed this book, "at least you understand that you ought to understand them".

Date provides exercises (with answers on the web), which should make it much easier to "self-pace" your study and, unlike some of Date's other books, this really isn't just an academic textbook in my view (perhaps this is why the index isn't perfect – I found at least one reference a page out).

Finally, however, a warning. The back cover of this book promises that reading and understanding it will "set you apart and well above the competition when it comes to getting work done with a relational database". I'm afraid I seriously doubt that, for all sorts of dysfunctional reasons, although that wouldn't stop me reading it.

It will make you better informed and wiser, and I'll listen more attentively when you tell me that multi-valued databases are the way to go if you've read and understood it, but true relational support is so rare these days that knowledge of relational theory may actually be a practical hindrance in some companies.

While the superior man is worrying about relational issues, the hack will be cobbling together something which just about works, mostly and for now. Which may be all that his employer wants – until things go wrong and they wish they'd used someone who actually understood database theory (or at least someone who knew what they didn't know). As usual, "doing it right" is easiest in a mature organisation with enlightened and informed managers – and how common are these today?

Database in Depth

Verdict: If you are a DBA or claim to understand databases, then I think you should read and digest this book. You may not agree with all the points Date makes, but you should be able to follow his arguments. And it is actually pretty readable on the surface, and provides exercises (with answers), which will help you navigate its depths.

Author: C J Date

Publisher: O'Reilly

ISBN: 0-596-10012-4

Media: Book

List Price: £20.95
