Codd almighty! How IBM cracked System R
The tech team that made relational databases a reality
System R made Codd's ideas intelligible
There was a striking disparity between Codd's ideals and his practice.
"Ted couched [his ideas] in mathematical symbolism and terminology. In his original query languages he used mathematical notation, like universal quantifiers and existential quantifiers, and he used Greek letters a lot. Things like that just give the appearance of something being very esoteric and difficult to deal with. Whereas, actually, what he was trying to do was to make queries easier to write, not harder."
It was System R that took Codd's ideas and made them intelligible, by creating a simple query language. The researchers' query language was initially called SQUARE, for Specifying Queries As Relational Expressions. Even SQUARE had some mathematical notation. Its successor, Structured English Query Language, was based exclusively on English words.
"The development of a language based on English keywords, which you could type on your keyboard and which you could read and understand intuitively, was a breakthrough that made it much easier for people to understand the underlying simplicity of Ted’s idea. It didn’t really make the ideas any more simple; it just made them look simple."
System R member Jim Gray in 1977
"I thought that it was a crazy idea, but it is good for researchers to work on crazy ideas and so I thought maybe something would come of it. We worked on it for about six months and concluded that we didn’t see how by changing the level of abstraction downwards that we were going to make things better. It looked like things would be a lot worse."
The System R team spent 1976 and 1977 testing the single-user system. By 1978 and 1979 IBM was into the third phase of the project. Testers reported that while performance was slow, the woes of the hierarchical model had been banished: it was easier to design, load and then change a database.
In their 1981 paper "A History and Evaluation of System R", Chamberlin and his team wrote: "The performance degradation must be traded off against the advantages of normalisation in which large database tables are broken up into smaller parts to avoid redundancy, and then joined together by the view mechanism and other applications."
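The trade-off the team describes can be sketched in a few lines. What follows is a minimal illustration only, using Python's built-in sqlite3 module rather than anything from System R; the table, column and view names are hypothetical. Department details are stored once in their own table, and a view joins them back to the employee rows on demand.

```python
import sqlite3

# In-memory SQLite database; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised form: department facts live in exactly one row.
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT,
        location  TEXT
    );
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept_id  INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (1, 'Research', 'San Jose');
    INSERT INTO employee VALUES (10, 'Alice', 1), (11, 'Bob', 1);

    -- The view joins the smaller tables back into one wide result.
    CREATE VIEW employee_directory AS
        SELECT e.emp_name, d.dept_name, d.location
        FROM employee e
        JOIN department d ON e.dept_id = d.dept_id;
""")

# One update to one row; every employee in the view reflects it.
conn.execute("UPDATE department SET location = 'Almaden' WHERE dept_id = 1")

rows = conn.execute(
    "SELECT emp_name, dept_name, location "
    "FROM employee_directory ORDER BY emp_name"
).fetchall()
print(rows)
```

The join behind the view is the price paid at query time; the single update is the redundancy avoided.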
The SQL language was also a hit. The System R team had made an extraordinary amount of progress: SQL was compiled into machine code. Locking and concurrency issues were tackled, too.
The Berkeley Three
The relational model began to intrigue the computer science community. For those with a maths background, Codd's ideas weren't forbidding. Gray would recall:
"What happened is that the academic community found DBTG and IMS pretty complicated. It wasn’t elegant and there wasn’t a theory associated with it. You couldn’t prove theorems about it, or at least they didn’t figure out how to. And along came the relational model with query optimization, and transactions, and security. The data model was simple enough that you could state it and then start reasoning about it."
Two academics on the other side of the Bay, at Berkeley, had also read Codd's papers and were trying to do the same thing as the System R team: put those ideas into practice. The early description of System R had been published. The core team at Berkeley was Mike Stonebraker, a Berkeley researcher (and, like Codd and Childs, a Michigan graduate), and Gene Wong, his professor, along with a small number of graduates and just one full-time programmer.
They called their project INGRES, for INteractive GRaphics and REtrieval System. The team developed its own query language, QUEL. Through necessity, they targeted more modest systems than IBM's mainframes, running an experimental OS called Unix on a DEC PDP-11/40. The source code would be made available to anyone who paid a modest fee.
"Jim Gray has a PhD from Berkeley and was around during 1971 and part of 1972, so Eugene and I got to know him and then he went to IBM Research and joined the System R team," Stonebraker recalled. "So it was mostly his doing that we would go to IBM Research in San Jose or they would come up here. So we probably met every six months, and so we knew what they were doing, they knew what we were doing."
In Stonebraker's modest words, Ingres offered an "unsupported operating system and an unsupported database and no COBOL". Nevertheless, it found one significant commercial user, the New York Telephone Company, and by 1978 and 1979 the academics had got serious about commercialising it.
At the time, most independent software companies were private entities. For the venture capitalists, the risk was high: IBM might destroy their newly floated business. This was the background against which Stonebraker started a company, Relational Technology Inc, in 1980.
The System R work had been published in the IBM Systems Journal, where mathematician Ed Oates had seen it, and discussed it with an ex-IBM programmer and database developer at Ampex called Larry Ellison.
Inspired by Codd's work and the work of the IBM System R team, Ellison and Oates later co-founded a startup with Bob Miner, which they called Software Development Laboratories. By 1979, they had changed its name to Relational Software, Inc (RSI), which became Oracle Systems Corporation in 1982, going public in 1986 and becoming known simply as Oracle Corporation in 1995.