A postcard from Intel in Lisbon
Intel says 'parallel or perish'
James Reinders says that, soon, a programmer who doesn't "think parallel" won't be a programmer.
Here are Reinders' rules for multiprogramming, in my words, and as I see them:
- Think parallel first. Don't even contemplate bolting on parallel processing capabilities afterwards.
- Code to express the parallel nature of the problem. Don't write thread management code: doing so is the equivalent of writing in Assembler instead of C# or Java.
- Don't tie threads to particular processors. You don't want to write programs that only run properly on a particular number of cores.
- Plan to scale through increased workload. Amdahl's Law often limits the performance gain you can get from parallel processing applied to a fixed-size workload (there is usually some significant serial part of the process which can't easily be parallelised). But Gustafson observed that if you increase the workload, the serial part of the process often remains fixed, and parallel processing then lets you get through the much bigger workload in a similar elapsed time.
- Only create programs which can arbitrarily add tasks to the workload, so that if more processors become available, the workload can take advantage of them.
- Only write programs that can run serially, mainly because (assuming that all new PCs will be multicore) they'll then be easier to debug. For the time being, though, your programs will also be expected to run OK on legacy single-processor machines. Remember that a program optimised for multiprocessors will usually run more slowly on a uniprocessor, so be aware of this and don't rush headlong into coding for multicore architectures.
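To put rough numbers on the Amdahl and Gustafson point in the rules above, here is a minimal sketch in Python (chosen purely for illustration). The function names and the example figures are my assumptions, not anything from Reinders' talk. Note that Gustafson's formula measures the serial fraction on the parallel run, so the two fractions are not strictly the same quantity.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Speedup for a FIXED-size workload (Amdahl's Law).

    Only the parallel part (1 - serial_fraction) shrinks as processors are added.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def gustafson_speedup(serial_fraction: float, processors: int) -> float:
    """Scaled speedup when the workload GROWS with the processor count (Gustafson).

    serial_fraction here is the share of the parallel run spent in serial code.
    """
    return serial_fraction + (1.0 - serial_fraction) * processors
```

With a 10 per cent serial fraction on 16 processors, `amdahl_speedup(0.1, 16)` comes to about 6.4 and can never exceed 10 however many processors you add, whereas `gustafson_speedup(0.1, 16)` comes to about 14.5.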
And some of the tools which Intel thinks will help you follow these rules are:
- The OpenMP standard, which bolts efficient parallelising onto C++ and Fortran compilers using compiler hints. This is what Sutter calls "industrial strength duct tape", but it works.
- Threaded Building Blocks – C++ algorithms for scalable threading (Reinders seems very confident in this tool).
- Thread Profiler – which highlights potential performance bottlenecks.
- Thread Checker – which detects latent race conditions and potential deadlocks.
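The rules ask for code that states what can run in parallel and leaves the threads to a library, which is the job these tools do for C++ and Fortran. The sketch below shows the same shape using Python's standard `concurrent.futures` module, not OpenMP or Threaded Building Blocks. The `word_count` and `total_words` names are hypothetical. It never creates or names a thread, never mentions a core count, and still runs serially for debugging. In standard CPython, threads won't speed up CPU-bound work like this, so read it for structure, not performance.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(line: str) -> int:
    """One independent task: it touches no shared state, so tasks can run in any order."""
    return len(line.split())


def total_words(lines, parallel: bool = True) -> int:
    """Sum word counts over lines, either serially or via a library-managed pool."""
    if not parallel:
        # Serial path: same result, much easier to step through in a debugger.
        return sum(map(word_count, lines))
    # The pool picks its own worker count; the code is not tied to a number of cores.
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(word_count, lines))
```

Calling `total_words(lines)` and `total_words(lines, parallel=False)` should always give the same answer; if they ever differ, the tasks were not as independent as you thought.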
But now a note of caution. Parallel processing has always been a holy grail of computing (although Intel came to it late, perhaps). Many of the issues talked about at this conference I've met before – on multiprocessor mainframes (the most efficient way to achieve parallel processing in practice may be the mainframe job scheduler).
I've told good programmers to think about the consequences of running on multiprocessors, only to be told that "the compiler will look after it" (in general, it can't). And I've seen the results of programmers forgetting that their code can run on several processors, so that, in production, things sometimes run in the wrong order. This seldom shows up in testing because, even if several processors are available to the test system, the chances are that you don't process enough data to expose the latent race conditions, which tend to appear when the system is overloaded.
I've had to deal with the consequences of programmers deciding that they can do locks better than IBM and coding them for themselves (the application I'm thinking of was very fast – for a while, until the consequences of never releasing locks became apparent).
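The application I have in mind isn't reproduced here, but the failure is easy to sketch: a lock taken on one path and never released on another, typically an error path. As a minimal illustration in Python, with a hypothetical `Account` class, the safer habit is to let the language release the lock for you on every exit path:

```python
import threading


class Account:
    """Hypothetical example: a balance guarded by a single lock."""

    def __init__(self, balance: int = 0) -> None:
        self._lock = threading.Lock()
        self.balance = balance

    def withdraw(self, amount: int) -> None:
        # 'with' releases the lock on every exit path, including the
        # exception below. A bare acquire() followed by an early return
        # or raise would leave the lock held forever.
        with self._lock:
            if amount > self.balance:
                raise ValueError("insufficient funds")
            self.balance -= amount
```

After a failed `withdraw`, the lock is free again and the next caller proceeds normally, rather than queuing behind a lock nobody will ever release.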
This stuff seems to be hard, so we're going to need very good tools and more training. And probably, much better adherence to good development process.
Do I think that parallel processing of this sort is the way of the future? Yes, emphatically: if you run on Intel or similar architectures, it's the only way (it seems to me) to scale computer processor power effectively. Whether we need to scale processor power at all, or whether lots of specialised small computers (another kind of parallel processing) will work better, might be another question.
Reinders tried to make the point that parallelism was intuitive. His example was the queue – it's really quite intuitive that if you have a long queue, you just need more people on the desks servicing it. Simple. But this can hide a lot of complexity – if you have more desks and shorter queues checking in at Heathrow, things go faster. But you don't expect to get past check-in and find several people are assigned to one seat.
This is a trivial example, but move back a bit and airlines have gone bust because their booking systems couldn't cope with the essentially parallel activity of selling seats in an aeroplane at travel agents across the country. Planes flying three quarters full with spare capacity to cover "collisions" for seats – or upgrading overbooked passengers for travel on the next flight – can get expensive.
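In programming terms, the seat problem is a check-then-act race: two desks both see seat 1A free, and both hand it out. A minimal sketch in Python, with a hypothetical `Flight` class that has nothing to do with any real booking system, makes "is there a seat?" and "take it" a single step under a lock:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Flight:
    """Hypothetical seat allocator shared by many check-in desks."""

    def __init__(self, seats) -> None:
        self._lock = threading.Lock()
        self._free = list(seats)
        self.assigned = {}  # seat -> passenger

    def check_in(self, passenger: str):
        """Return a seat for the passenger, or None if the flight is full."""
        # Checking for a free seat and taking it happen under one lock,
        # so no two desks can ever hand out the same seat.
        with self._lock:
            if not self._free:
                return None
            seat = self._free.pop()
            self.assigned[seat] = passenger
            return seat


# Usage: eight "desks" serve a queue of ten passengers for three seats.
flight = Flight(["1A", "1B", "1C"])
with ThreadPoolExecutor(max_workers=8) as desks:
    results = list(desks.map(flight.check_in, [f"passenger-{i}" for i in range(10)]))
```

More desks shorten the queue, but the lock is the serial part that stops several people ending up in one seat; it is also exactly the kind of fixed serial cost that Amdahl's Law warns about.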
Do I think that parallelism is intuitive? "Only up to a point, Lord Copper". The consensus among the speakers at the conference was that this would be a revolution in thinking comparable with the OO revolution or structured programming. And (rather like OO) it will probably only become routine once the "old guard" dies off and a new generation of graduates that knows no other way of thinking takes over. ®