IBM insider: How I caught my wife while bug-hunting on OS/2

No wonder that chkdsk flaw was never fixed

Testing was a concept Microsoft struggled with

Testing OS/2 was a concept Microsoft struggled with at the time. More than once I had to deal with Redmond code that was reported to have passed and failed various quality assurance tests with complex behaviours, yet for some reason that code consisted solely of a RETF instruction, which (in theory) simply ends a subroutine call.

Their code reviews were a joke. Their developers put source code comments of the form “Skip over random IBM NLS shit” in the support for national languages, and the comment “if window count is zero return false” sat next to a line that always returned zero. Microsoft flatly refused to fix this one, saying that it conformed to Redmond's coding standards, a copy of which we never managed to acquire, if it actually existed.

We, on the other hand, were regarded as hopelessly bureaucratic. After Microsoft lost the source code for the actual build of OS/2 we shipped, I reported a bug triggered when you double-clicked on Chkdsk twice: the program would fire up twice and both would try to fix the disk at the same time, causing corruption. I noted that this “may not be consistent with the user's goals as he sees them at this time”. This was labelled a user error, and some guy called Ballmer questioned why I had this “obsession” with perfect code.
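The usual defence against this class of bug is a single-instance guard: the second copy of the program notices the first is already running and backs off. What follows is a minimal sketch of that general technique in Python, not a description of how Chkdsk or OS/2 worked. The lock-file approach and the names try_acquire and release are assumptions made purely for illustration.

```python
import os
import tempfile


def try_acquire(lock_path):
    """Return a file descriptor if we now hold the lock, or None if
    another instance created the lock file first."""
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so only one caller can succeed.
        return os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None


def release(fd, lock_path):
    """Close and delete the lock file so a later run can start."""
    os.close(fd)
    os.remove(lock_path)


if __name__ == "__main__":
    # Hypothetical lock location, chosen only for this demo.
    path = os.path.join(tempfile.mkdtemp(), "diskcheck.lock")
    first = try_acquire(path)
    second = try_acquire(path)  # simulates the second double-click
    print("first instance runs:", first is not None)
    print("second instance runs:", second is not None)
    release(first, path)
```

A real tool would also need to cope with a stale lock file left behind by a crash, which this sketch ignores.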

IBM had sheets of stats on productivity and financials, the software development equivalent of AIDS. For no good reason, IBM thought software quality correlated to things like the abundance of newline characters in the source code.

So Big Blue extended its bizarre measure of productivity - the KLOC, or 1,000 lines of code - to such an extent that the source code editors we used came ready with macros to bulk up your code; one, for example, would extend C comments over multiple lines to make your code pass insanely dumb metrics. Suddenly, everything looked good.
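To see why a raw line count is so easy to game, here is a small Python sketch of the kind of comment-padding such a macro might have done. It is not the actual IBM editor macro, whose details are not given here. The function names and the one-word-per-line layout are invented for illustration.

```python
def inflate_comment(line):
    """Spread a one-line C comment over several lines, one word per line.
    Anything that is not a complete /* ... */ comment is returned unchanged."""
    stripped = line.strip()
    if not (stripped.startswith("/*") and stripped.endswith("*/")):
        return [line]
    words = stripped[2:-2].split()
    return ["/*"] + [" * " + word for word in words] + [" */"]


def inflate_source(source):
    """Apply inflate_comment to every line of a source string."""
    out = []
    for line in source.splitlines():
        out.extend(inflate_comment(line))
    return "\n".join(out)


sample = "/* add one to x */\nx = x + 1;"
padded = inflate_source(sample)
# The program does exactly the same thing, but the line count grows.
print(len(sample.splitlines()), "->", len(padded.splitlines()))  # prints "2 -> 7"
```

The single executable statement is untouched, yet the line-count metric more than triples.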

Because we were starting from a largely clean base, we could do things right with OS/2. Even with the benefit of more experience and hindsight I think nearly all of the engineering decisions were the right ones and the implementation was pretty sound by the standards of the time.

IBM's Personal System/2 PC was announced at the same time as OS/2, and the computer was supposed to primarily run our operating system - but the first shipments of the hardware ended up using PC-DOS.

Most OS/2 developers at IBM and Microsoft not only didn’t use PS/2s, we weren’t even aware of their existence until too late. The parallel with Windows Surface tablets is quite striking here - a spurious marketing connection between hardware and software. Whereas PS/2s struggled to run OS/2 properly, the Surface can’t run a decent version of MS Office at all and is incompatible with Windows on purpose.

OS/2 was elegant

As OS/2 quickly evolved from an extended DOS, shared libraries and threading support were added to the mix, as was the idea that the operating system's software interface - the API - should be carefully designed rather than allowed to become a mess of randomly named functions.

The API was coherent enough that you could guess the order and type of parameters without reading the documentation, because they followed a pattern.

There were real arguments over the API design, though, and it was not astonishing to see a six-page change request for the name of one API call. This doesn’t sound too bad until I share that the call was eventually named WinBeep. There were even existential arguments over whether beeping should be allowed.

Still, my favourite was the SheIndicatePossibleDeath whereby the Shell (She) would signal that the system was not well and that steps should be taken to recover or gracefully restart. The Microsoft devs thought this was hilarious and instead felt that the Trap D black screen of death was all a user needed to see in such cases.

The OS/2 Black Screen of Death

Oh, I see! Trap code D, it's so obvious what's wrong

They, of course, eventually demonstrated their superior skills by upgrading Windows NT to the blue screen of death and giving secretaries, accountants and other office users such vital information as the various memory addresses involved in the screwup so that they can patch a dodgy device driver.

So, were there any API code examples?

No, you fool. The OS/2 API documentation was programming-language neutral. Some examples of actual code were in the development kit, and they were of high quality, but they were perhaps one per cent of what needed to have been written.

Documentation was hard to maintain due to the sheer speed at which changes were made to the API. One of the times I made myself unpopular involved revealing a simple mathematical model that showed the rate of change of the system was so high that our developers could not keep up and the testers could not even write the tests for it.

This meant the project would either be late or never ever reach completion and there was no third possibility: shipping it on time. I wasn’t the only person pointing this out, but no senior IBM manager wanted to be the first to say with authority that the product would be late. I can’t believe you haven’t seen that on your own projects.
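The shape of such an argument can be sketched in a few lines of Python. This is not the original model, whose details are not given here. It is an assumed, deliberately crude version in which work items arrive at a fixed weekly rate and are completed at a fixed weekly rate. The function name and all of the numbers are hypothetical.

```python
def weeks_to_finish(backlog, done_per_week, new_per_week, max_weeks=520):
    """Return the number of weeks until the backlog reaches zero,
    or None if it cannot be cleared within max_weeks."""
    if done_per_week <= new_per_week:
        # Changes arrive at least as fast as they are handled,
        # so the backlog can never shrink.
        return None
    weeks = 0
    while backlog > 0 and weeks < max_weeks:
        backlog += new_per_week - done_per_week
        weeks += 1
    return weeks if backlog <= 0 else None


# Hypothetical figures: 100 outstanding items, 10 completed per week.
print(weeks_to_finish(100, 10, 5))   # 20: finishable, but later than a naive 10-week plan
print(weeks_to_finish(100, 10, 12))  # None: the system changes faster than the team can follow
```

With no incoming changes the same backlog would take 10 weeks, so any positive change rate pushes the date out, and a change rate at or above the completion rate means there is no date at all.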

I write this article with hindsight, but be clear that I was near the bottom of the food chain and most of the decisions seemed reasonable enough to me.

In spite of this, OS/2 was easier to program than anything else you could find. We knew this for a fact and buried somewhere in IBM are the videos to prove it. IBM hired in skilled programmers from every platform and asked them to carry out various programming tasks in the usability lab and they took longer than they should have.

The problem was that the developers kept on asking “how do I do X” where X was some hack or workaround you didn’t need to do in OS/2. Mac and Windows developers actually seemed quite angry that so many of their favourite kludges were not needed.

After years of manuals that consisted of insider jokes and interesting puzzles, the Unix devs admired our documentation. The DOS programmers thought Christmas had come and wanted to come and work with us.

I tried and miserably failed to get those videos put out as part of the advertising for OS/2, which was, well, really quite like the advertising for Windows 8. You could tell a lot of money was spent, but it left no real reason in your mind to actually do anything about it. In any case, like Microsoft now, IBM wanted to talk to “real people”, not those who understood computers and made the IT decisions for businesses.

Documentation is another part of the whole saga where I share in the guilt of failure. This was before I took up writing, and I could have helped the documentation team more but it was a bit dull.

I was on the inside track of the OS with which MS and IBM both wanted to rule the world; I expected to make real money from the project, so the fewer plebs who understood OS/2 programming, the better for me and IBM.

Or so we thought - look out for part two. ®
