IBM insider: How I caught my wife while bug-hunting on OS/2

No wonder that chkdsk flaw was never fixed

Protecting against web application threats using SSL

Testing was a concept Microsoft struggled with

Testing OS/2 was a concept Microsoft struggled with at the time. More than once I had to deal with Redmond code that was reported to have passed and failed various quality assurance tests with complex behaviours, yet for some reason that code consisted solely of a RETF instruction, which (in theory) simply ends a subroutine call.

Their code reviews were a joke. Their developers put source code comments of the form “Skip over random IBM NLS shit” in the support for national languages, and the comment “if window count is zero return false” was next to a line that always returned zero. Microsoft flatly refused to fix this one saying that it conformed to Redmond's coding standards, a copy of which we never managed to acquire if it actually existed.

We, on the other hand, were regarded as hopelessly bureaucratic. After Microsoft lost the source code for the actual build of OS/2 we shipped, I reported a bug triggered when you double-clicked on Chkdsk twice: the program would fire up twice and both would try to fix the disk at the same time, causing corruption. I noted that this “may not be consistent with the user's goals as he sees them at this time”. This was labelled a user error, and some guy called Ballmer questioned why I had this “obsession” with perfect code.

IBM had sheets of stats on productivity and financials, the software development equivalent of AIDS. For no good reason, IBM thought software quality correlated to things like the abundance of newline characters in the source code.

So Big Blue extended its bizarre measure of productivity - the number of KLOCs, or 1,000 lines of code - to such an extent that the source code editors we used came ready with macros to bulk up your code; for example, it would extend C comments over multiple lines to make your code pass insanely dumb metrics. Suddenly, everything looked good.

Because we were starting from a largely clean base, we could do things right with OS/2. Even with the benefit of more experience and hindsight I think nearly all of the engineering decisions were the right ones and the implementation was pretty sound by the standards of the time.

IBM's Personal System/2 PC was announced at the same time as OS/2, and the computer was supposed to primarily run our operating system - but the first shipments of the hardware ended up using PC-DOS.

Most OS/2 developers at IBM and Microsoft not only didn’t use PS/2s, we weren’t even aware of their existence until too late. The parallel with Windows Surface tablets is quite striking here - a spurious marketing connection between hardware and software. Whereas PS/2s struggled to run OS/2 properly, the Surface can’t run a decent version of MS Office at all and is incompatible with Windows on purpose.

OS/2 was elegant

As OS/2 quickly evolved from an extended DOS, shared libraries and threading support were added to the mix as was the idea that the operating system's software interface - the API - should be carefully designed rather than allowed to become a mess of randomly named functions.

The API was coherent enough that you could guess the order and type of parameters because they followed a pattern without reading the documentation.

There were real arguments over the API design, though, and it was not astonishing to see a six-page change request for the name of one API call. This doesn’t sound too bad until I share that the call was eventually named WinBeep. There were even existential arguments over whether beeping should be allowed.

Still, my favourite was the SheIndicatePossibleDeath whereby the Shell (She) would signal that that the system was not well and that steps should be taken to recover or gracefully restart. The Microsoft devs thought this was hilarious and instead felt that the Trap D black screen of death was all a user needed to see in such cases.

The OS/2 Black Screen of Death

Oh, I see! Trap code D, it's so obvious what's wrong

They, of course, eventually demonstrated their superior skills by upgrading Windows NT to the blue screen of death and giving secretaries, accountants and other office users such vital information as the various memory addresses involved in the screwup so that they can patch a dodgy device driver.

So, were there any API code examples?

No, you fool. The OS/2 API documentation was programming-language neutral. Some examples of actual code was in the development kit, and it was of high quality, but it was perhaps one per cent of what needed to have been written.

Documentation was hard to maintain due to the sheer speed at which changes were made to the API. One of the times I made myself unpopular involved revealing a simple mathematical model that showed the rate of change of the system was so high that our developers could not keep up and the testers could not even write the tests for it.

This meant the project would either be late or never ever reach completion and there was no third possibility: shipping it on time. I wasn’t the only person pointing this out, but no senior IBM manager wanted to be the first to say with authority that the product would be late. I can’t believe you haven’t seen that on your own projects.

I write this article with hindsight, but be clear that I was near the bottom of the food chain and most of the decisions seemed reasonable enough to me.

In spite of this, OS/2 was easier to program than anything else you could find. We knew this for a fact and buried somewhere in IBM are the videos to prove it. IBM hired in skilled programmers from every platform and asked them to carry out various programming tasks in the usability lab and they took longer than they should have.

The problem was that the developers kept on asking “how do I do X” where X was some hack or workaround you didn’t need to do in OS/2. Mac and Windows developers actually seemed quite angry that so many of their favourite kludges were not needed.

After years of manuals that consisted of insider jokes and interesting puzzles, the Unix devs admired our documentation. The DOS programmers thought Christmas had come and wanted to come and work with us.

I tried and miserably failed to get those videos put out as part of the advertising for OS/2 which was, well, really quite like the advertising for Windows 8. You could tell a lot of money was spent, but it left no real reason in your mind to actually do anything about it. In any case like Microsoft now, IBM wanted to talk to “real people”, not those who understood computers and made the IT decisions for businesses.

Documentation is another part of the whole saga where I share in the guilt of failure. This was before I took up writing, and I could have helped the documentation team more but it was a bit dull.

I was on the inside track of the OS with which MS and IBM both wanted to rule the world; I expected to make real money from the project, so the fewer plebs who understood OS/2 programming, the better for me and IBM.

Or so we thought - look out for part two. ®

New hybrid storage solutions

More from The Register

next story
New 'Cosmos' browser surfs the net by TXT alone
No data plan? No WiFi? No worries ... except sluggish download speed
'Windows 9' LEAK: Microsoft's playing catchup with Linux
Multiple desktops and live tiles in restored Start button star in new vids
iOS 8 release: WebGL now runs everywhere. Hurrah for 3D graphics!
HTML 5's pretty neat ... when your browser supports it
Mathematica hits the Web
Wolfram embraces the cloud, promies private cloud cut of its number-cruncher
Mozilla shutters Labs, tells nobody it's been dead for five months
Staffer's blog reveals all as projects languish on GitHub
'People have forgotten just how late the first iPhone arrived ...'
Plus: 'Google's IDEALISM is an injudicious justification for inappropriate biz practices'
SUSE Linux owner Attachmate gobbled by Micro Focus for $2.3bn
Merger will lead to mainframe and COBOL powerhouse
iOS 8 Healthkit gets a bug SO Apple KILLS it. That's real healthcare!
Not fit for purpose on day of launch, says Cupertino
prev story


Secure remote control for conventional and virtual desktops
Balancing user privacy and privileged access, in accordance with compliance frameworks and legislation. Evaluating any potential remote control choice.
WIN a very cool portable ZX Spectrum
Win a one-off portable Spectrum built by legendary hardware hacker Ben Heck
Storage capacity and performance optimization at Mizuno USA
Mizuno USA turn to Tegile storage technology to solve both their SAN and backup issues.
High Performance for All
While HPC is not new, it has traditionally been seen as a specialist area – is it now geared up to meet more mainstream requirements?
The next step in data security
With recent increased privacy concerns and computers becoming more powerful, the chance of hackers being able to crack smaller-sized RSA keys increases.