Feeds

Déjà Vista

Has writing operating systems changed since OS/2?

Boost IT visibility and business value

Comment As Vista slowly slips further into the mists of the future, I sometimes wonder if anything has really changed since I was on the losing side of the IBM-Microsoft OS/2 war. Why do we now hear of huge re-writes in a product that's supposed to be almost ready?

As a former O/S (Operating System) bug hunter, it sounds rather familiar. No matter how disciplined you are in architecture, little issues have a way of forcing you to implement wider-ranging changes, and these build up like water behind a dam. This can become a nasty techno/political problem. Managers will play bluffing games, each hoping another team will take the bullet of making the project late. So, you have managers expressing increasingly fictitious optimism, until the breaking point - when suddenly torrents of issues will flood through, with lots of relief (and a bit of schadenfreude) all round.

Then you can "fix" the project, merely by firing managers at random and allowing them to become blame sinks. It would be too cynical of me to say that the management changes we've seen reported here are the result (El Reg's take on this is here). Way too cynical - I can't be right can I?

It's hard to gauge progress in a huge project like an O/S, so management focuses on a range of statistics and capability milestones. OS/2 reached the point several times where we found bugs faster than they could be fixed. One set were caused by the compiler and processor having different ideas about valid opcodes. Thus, bugs could depend upon the revision level of the CPU, and no code review could help you. Management responded to this more than once by simply banning the finding of bugs, in a denial of bad news that would shame a Soviet spin doctor; you had to be a third level manager to "approve" a bug at one point.

Bug hunters are seen as the bad guys by individual managers, whatever the top level says (some managers think that the testing process itself actually "makes" bugs, which wouldn't be there if you didn't test - Ed]. They bring bad news. In all organisations the bringers of good news prosper over the bringers of bad. At IBM, however, us bug hunters were generally seen as valuable by the organisation as a whole, not least because we annoyed Microsoft; and, yes, blame management became a big issue, complete with its negative productivity. Microsoft has lots of smart people; "evangelists" who ride out to inspire us all with the wisdom of Microsoft ways. Ever heard of its testers? Reckon they're the best paid people in Redmond?

A tester should have a superset of the skills used by developers; but at too many firms testing is a sin bin, and paid accordingly; and it's hard to see that not being the cause of many disasters on the scale of Vista. At IBM, some of us had a simplistic model. Each line of new/fixed code put into the system has some probability of breaking it in a new way, and this probability grows with the size of code already written. So, for a given quality and size of programming team, there is a maximum size of system. Beyond that, any attempt at change is as likely to break something else as to fix a problem. This steady state is permanent and fatal; have the Vista teams hit it?

Yes, I do mean "teams" plural. The failure of OS/2 commercially was partly down to the decision to do version 1.3, and thus stop 2.0 in its tracks. 1.3 was a result of listening to customers, and was a lighter, faster, crap version, of 1.2, whereas 2.0 had loads of cool features like being 32 bit and having the ability to run Windows apps. There are competing teams within any large project – and when one version hits the sweet spot of desirability and plausibility for delivery, do you think that makes them friends?

Individual Microsoft programming units are often larger than several whole companies, and so (it seems to me) "us" may be our bit of Microsoft, and "them" another part of Microsoft; not the official Linux enemy. Also, techies are known for their bitchiness (I have no doubt that this article will prod people to point out some of my screw-ups over the last 20 years) and often bug reports and fix requests are seen as coming from malicious incompetents, not colleagues. We referred to the IBM team rewriting MS compilers as "children playing with matches" and refused point-blank to use their output; and they doubtless had equally unkind things to say about us. "Arrogant prima donnas" is probably all I can use here, without getting Reg Developer added to the banned site list.

The Vista teams must also be hitting "deadline fatigue" by now. People get increasingly cynical about the assertions made by other groups and, after a while, by their own managers, who are torn between honesty with the troops and being seen as a "good team player" by their peers and bosses. Also, in order to actually get a program out of the door, there are always compromises and fudges and things that just happen to work. More than one bug in a Microsoft product has been fixed by deleting something from the manual [that happened back in the days of MVS mainframes too, nothing changes - Ed]. We've all done this, but the more deadlines you rush for, the more "clever hacks" accumulate, and by their nature are not only undocumented, but often unknown to anyone other than the developer who bodged them.

Source code control can fray a bit in the last desperate hours - it's not unknown for the source for the version of memory manager that actually shipped to be "lost". These issues actually make net progress slower, and the sort of managers who genuinely believe that sport is a good metaphor in setting goals for teams often don't realise that trying for an unachievable target doesn't bring out the best in people. In fact, it actually digs a hole for the next wave of cannon fodder to fall into.

DRM (Digital Rights Management) doesn't help. Ever since Fred Brooks wrote The Mythical Man-Month, based on his experiences with OS/360 (and since augmented with new material), we've tried to segment large systems so that bugs don't spread. But to be effective, digital rights must be managed right across the system in a cooperative fashion. That massively increases the effort and bug count, yet as far as the customer is concerned DRM itself is the bug. A "feature" is something you'd pay money to get. DRM does not pass that test.

But I'm not in the OS-writing game any more, thankfully, so I'm looking forward to lots of feedback telling me in detail where my ideas are obsolete.®

Build a business case: developing custom apps

More from The Register

next story
KDE releases ice-cream coloured Plasma 5 just in time for summer
Melty but refreshing - popular rival to Mint's Cinnamon's still a work in progress
Leaked Windows Phone 8.1 Update specs tease details of Nokia's next mobes
New screen sizes, dual SIMs, voice over LTE, and more
Mozilla keeps its Beard, hopes anti-gay marriage troubles are now over
Plenty on new CEO's todo list – starting with Firefox's slipping grasp
Apple: We'll unleash OS X Yosemite beta on the MASSES on 24 July
Starting today, regular fanbois will be guinea pigs, it tells Reg
Another day, another Firefox: Version 31 is upon us ALREADY
Web devs, Mozilla really wants you to like this one
Secure microkernel that uses maths to be 'bug free' goes open source
Hacker-repelling, drone-protecting code will soon be yours to tweak as you see fit
Cloudy CoreOS Linux distro declares itself production-ready
Lightweight, container-happy Linux gets first Stable release
prev story

Whitepapers

Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
Boost IT visibility and business value
How building a great service catalog relieves pressure points and demonstrates the value of IT service management.
Why and how to choose the right cloud vendor
The benefits of cloud-based storage in your processes. Eliminate onsite, disk-based backup and archiving in favor of cloud-based data protection.
The Essential Guide to IT Transformation
ServiceNow discusses three IT transformations that can help CIO's automate IT services to transform IT and the enterprise.
Maximize storage efficiency across the enterprise
The HP StoreOnce backup solution offers highly flexible, centrally managed, and highly efficient data protection for any enterprise.