Feeds

Safe signals in Perl

The new gotcha

Boost IT visibility and business value

Once upon a time, handling signals in Perl code had a pretty big gotcha — one that you couldn’t work around. Perl 5.8 changed signal handling in a way that eliminated that gotcha, but replaced it with a different one, harder to trigger, but no less surprising.

The original gotcha

Signals are delivered asynchronously — by design, you can’t predict when they’re going to arrive. They can’t arrive during the execution of a low-level machine instruction, but that’s pretty much the only guarantee you get.

Now, that might be fine and dandy if you’re writing in assembler, but that’s not true of most of us these days. In other languages, you’re going to run into a problem. Suppose your main code is busily modifying some data structure, and that your signal handler wants to modify the same data structure. Then when the signal is delivered, and your handler is invoked, the data structure could well be in a temporarily-inconsistent state. So if the handler does anything that modifies the data structure, it’s likely to end up corrupting it.

This can be particularly awkward for Perl. The problem is that when your code executes, that involves manipulating data structures inside the runtime (or “interpreter” if you prefer). If the signal handler does the same sort of manipulation — and let me tell you, it almost certainly does — then that’s a good way of corrupting the runtime’s internal data structures in ways that can cause all manner of exciting crashes.

Safe signals

The change in Perl 5.8 that deals with problem is called “safe signals”. Rather than having your signal-handling Perl code be directly invoked during the asynchronous receipt of the signal itself, Perl just has the signal handler note that a particular signal has been delivered. Then at suitable safe moments, it checks whether any signals have been delivered but not yet handled, and if so, invokes the appropriate Perl handler.

This neatly deals with the issue. The runtime never does anything from a signal handler that could corrupt its own data structures. But Perl-side handlers still get invoked asynchronously with respect to the main program.

The new gotcha

Unfortunately, this introduces a problem of its own. It’s all down to what counts as a suitable safe moment to invoke your Perl signal handler. To a first approximation, that means between execution of the individual operations that your code is compiled into. (There’s some additional complexity to allow handlers to be invoked during interrupted I/O system calls, but that’s a relatively minor detail.)

In most circumstances, this works just fine. The problem is when Perl-internal ops take a long time to execute.

Sam Tregar recently posted a message to the Perl-XML mailing-list about a problem he was having with XML::LibXML. When he fed certain badly-broken documents to XML::LibXML’s HTML parser, that triggered an infinite-loop bug in the underlying libxml2 library. He was attempting to work around that bug by using a timeout:

eval {
    local $SIG{ALRM} = sub { die "TIMEOUT\n" };
    alarm 10;
    $libxml->parse_html_string($html);
    alarm 0;
};

First, Sam sets up a signal handler for the duration of this eval block, to raise a distinguishable exception on receipt of a SIGALRM. The first alarm call asks for a SIGALRM signal to be delivered after 10 seconds. Then the HTML is parsed; if that finishes in a reasonable amount of time, the second alarm call disables the timeout. Once it’s all finished, surrounding code checks whether the timeout occurred, and acts appropriately if so.

This looks like perfectly reasonable code. The only problem is that it didn’t work — Sam’s timeout exception was never getting raised.

Knowing about the way Perl safe signals work offers a simple explanation of what’s happening here. XML::LibXML’s parse_html_string method calls an XS function which uses libxml2 to do the parsing. When the Perl runtime invokes an XS function, that’s a single op. So if the XS (or one of the C functions it uses) goes into an infinite loop, that op never finishes executing. But the Perl runtime waits for an op to finish before invoking your signal handler, so the handler never runs.

Ouch.

Workarounds

There are workarounds for this. I suggested to Sam on the list that he just switch back to the pre-5.8 signal-handling behaviour. That’s done by setting the PERL_SIGNALS environment variable to the string unsafe. Unfortunately, you can’t do it from within the program you want to be affected — the variable has to have been set at the point the Perl runtime starts up. A simple option is to put an env(1) wrapper round your code:

$ env PERL_SIGNALS=unsafe perl html_parser.pl

Failing that, in some situations, it might be possible to get your program to wrap itself:

BEGIN {
    if (!$ENV{PERL_SIGNALS} || $ENV{PERL_SIGNALS} ne 'unsafe') {
        $ENV{PERL_SIGNALS} = 'unsafe';
        exec $^X, $0, @ARGV;
    }
}

This isn’t without problems, though — remember that the modern “safe” signal handling is specifically intended to prevent a large class of unpredictable, hard-to-debug memory-corruption bugs that happen only on receipt of signals at certain times.

There isn’t really a perfect solution to Sam’s problem. The only other option is to disable the safe handling just for this specific signal. Instead of using the normal %SIG variable to install a suitable handler, you can ask the POSIX module to install an unsafe SIGALRM handler; Perl’s perlipc documentation has the details. That obviously doesn’t eliminate the problems caused by immediate signal delivery, but it does minimise the situations in which they can bite.

Aaron Crane is El Reg’s Technical Overlord. This piece was originally published on his personal website.

The Essential Guide to IT Transformation

More from The Register

next story
NO MORE ALL CAPS and other pleasures of Visual Studio 14
Unpicking a packed preview that breaks down ASP.NET
KDE releases ice-cream coloured Plasma 5 just in time for summer
Melty but refreshing - popular rival to Mint's Cinnamon's still a work in progress
Leaked Windows Phone 8.1 Update specs tease details of Nokia's next mobes
New screen sizes, dual SIMs, voice over LTE, and more
Another day, another Firefox: Version 31 is upon us ALREADY
Web devs, Mozilla really wants you to like this one
Put down that Oracle database patch: It could cost $23,000 per CPU
On-by-default INMEMORY tech a boon for developers ... as long as they can afford it
Mozilla keeps its Beard, hopes anti-gay marriage troubles are now over
Plenty on new CEO's todo list – starting with Firefox's slipping grasp
Apple: We'll unleash OS X Yosemite beta on the MASSES on 24 July
Starting today, regular fanbois will be guinea pigs, it tells Reg
prev story

Whitepapers

Implementing global e-invoicing with guaranteed legal certainty
Explaining the role local tax compliance plays in successful supply chain management and e-business and how leading global brands are addressing this.
The Essential Guide to IT Transformation
ServiceNow discusses three IT transformations that can help CIO's automate IT services to transform IT and the enterprise.
Consolidation: The Foundation for IT Business Transformation
In this whitepaper learn how effective consolidation of IT and business resources can enable multiple, meaningful business benefits.
How modern custom applications can spur business growth
Learn how to create, deploy and manage custom applications without consuming or expanding the need for scarce, expensive IT resources.
Build a business case: developing custom apps
Learn how to maximize the value of custom applications by accelerating and simplifying their development.