Feeds

Safe signals in Perl

The new gotcha

3 Big data security analytics techniques

Once upon a time, handling signals in Perl code had a pretty big gotcha — one that you couldn’t work around. Perl 5.8 changed signal handling in a way that eliminated that gotcha, but replaced it with a different one, harder to trigger, but no less surprising.

The original gotcha

Signals are delivered asynchronously — by design, you can’t predict when they’re going to arrive. They can’t arrive during the execution of a low-level machine instruction, but that’s pretty much the only guarantee you get.

Now, that might be fine and dandy if you’re writing in assembler, but that’s not true of most of us these days. In other languages, you’re going to run into a problem. Suppose your main code is busily modifying some data structure, and that your signal handler wants to modify the same data structure. Then when the signal is delivered, and your handler is invoked, the data structure could well be in a temporarily-inconsistent state. So if the handler does anything that modifies the data structure, it’s likely to end up corrupting it.

This can be particularly awkward for Perl. The problem is that when your code executes, that involves manipulating data structures inside the runtime (or “interpreter” if you prefer). If the signal handler does the same sort of manipulation — and let me tell you, it almost certainly does — then that’s a good way of corrupting the runtime’s internal data structures in ways that can cause all manner of exciting crashes.

Safe signals

The change in Perl 5.8 that deals with problem is called “safe signals”. Rather than having your signal-handling Perl code be directly invoked during the asynchronous receipt of the signal itself, Perl just has the signal handler note that a particular signal has been delivered. Then at suitable safe moments, it checks whether any signals have been delivered but not yet handled, and if so, invokes the appropriate Perl handler.

This neatly deals with the issue. The runtime never does anything from a signal handler that could corrupt its own data structures. But Perl-side handlers still get invoked asynchronously with respect to the main program.

The new gotcha

Unfortunately, this introduces a problem of its own. It’s all down to what counts as a suitable safe moment to invoke your Perl signal handler. To a first approximation, that means between execution of the individual operations that your code is compiled into. (There’s some additional complexity to allow handlers to be invoked during interrupted I/O system calls, but that’s a relatively minor detail.)

In most circumstances, this works just fine. The problem is when Perl-internal ops take a long time to execute.

Sam Tregar recently posted a message to the Perl-XML mailing-list about a problem he was having with XML::LibXML. When he fed certain badly-broken documents to XML::LibXML’s HTML parser, that triggered an infinite-loop bug in the underlying libxml2 library. He was attempting to work around that bug by using a timeout:

eval {
    local $SIG{ALRM} = sub { die "TIMEOUT\n" };
    alarm 10;
    $libxml->parse_html_string($html);
    alarm 0;
};

First, Sam sets up a signal handler for the duration of this eval block, to raise a distinguishable exception on receipt of a SIGALRM. The first alarm call asks for a SIGALRM signal to be delivered after 10 seconds. Then the HTML is parsed; if that finishes in a reasonable amount of time, the second alarm call disables the timeout. Once it’s all finished, surrounding code checks whether the timeout occurred, and acts appropriately if so.

This looks like perfectly reasonable code. The only problem is that it didn’t work — Sam’s timeout exception was never getting raised.

Knowing about the way Perl safe signals work offers a simple explanation of what’s happening here. XML::LibXML’s parse_html_string method calls an XS function which uses libxml2 to do the parsing. When the Perl runtime invokes an XS function, that’s a single op. So if the XS (or one of the C functions it uses) goes into an infinite loop, that op never finishes executing. But the Perl runtime waits for an op to finish before invoking your signal handler, so the handler never runs.

Ouch.

Workarounds

There are workarounds for this. I suggested to Sam on the list that he just switch back to the pre-5.8 signal-handling behaviour. That’s done by setting the PERL_SIGNALS environment variable to the string unsafe. Unfortunately, you can’t do it from within the program you want to be affected — the variable has to have been set at the point the Perl runtime starts up. A simple option is to put an env(1) wrapper round your code:

$ env PERL_SIGNALS=unsafe perl html_parser.pl

Failing that, in some situations, it might be possible to get your program to wrap itself:

BEGIN {
    if (!$ENV{PERL_SIGNALS} || $ENV{PERL_SIGNALS} ne 'unsafe') {
        $ENV{PERL_SIGNALS} = 'unsafe';
        exec $^X, $0, @ARGV;
    }
}

This isn’t without problems, though — remember that the modern “safe” signal handling is specifically intended to prevent a large class of unpredictable, hard-to-debug memory-corruption bugs that happen only on receipt of signals at certain times.

There isn’t really a perfect solution to Sam’s problem. The only other option is to disable the safe handling just for this specific signal. Instead of using the normal %SIG variable to install a suitable handler, you can ask the POSIX module to install an unsafe SIGALRM handler; Perl’s perlipc documentation has the details. That obviously doesn’t eliminate the problems caused by immediate signal delivery, but it does minimise the situations in which they can bite.

Aaron Crane is El Reg’s Technical Overlord. This piece was originally published on his personal website.

Top three mobile application threats

More from The Register

next story
This time it's 'Personal': new Office 365 sub covers just two devices
Redmond also brings Office into Google's back yard
Inside the Hekaton: SQL Server 2014's database engine deconstructed
Nadella's database sqares the circle of cheap memory vs speed
Oh no, Joe: WinPhone users already griping over 8.1 mega-update
Hang on. Which bit of Developer Preview don't you understand?
Microsoft lobs pre-release Windows Phone 8.1 at devs who dare
App makers can load it before anyone else, but if they do they're stuck with it
Half of Twitter's 'active users' are SILENT STALKERS
Nearly 50% have NEVER tweeted a word
Internet-of-stuff startup dumps NoSQL for ... SQL?
NoSQL taste great at first but lacks proper nutrients, says startup cloud whiz
IRS boss on XP migration: 'Classic fix the airplane while you're flying it attempt'
Plus: Condoleezza Rice at Dropbox 'maybe she can find ... weapons of mass destruction'
Ditch the sync, paddle in the Streem: Upstart offers syncless sharing
Upload, delete and carry on sharing afterwards?
New Facebook phone app allows you to stalk your mates
Nearby Friends feature goes live in a few weeks
prev story

Whitepapers

Top three mobile application threats
Learn about three of the top mobile application security threats facing businesses today and recommendations on how to mitigate the risk.
Combat fraud and increase customer satisfaction
Based on their experience using HP ArcSight Enterprise Security Manager for IT security operations, Finansbank moved to HP ArcSight ESM for fraud management.
The benefits of software based PBX
Why you should break free from your proprietary PBX and how to leverage your existing server hardware.
Five 3D headsets to be won!
We were so impressed by the Durovis Dive headset we’ve asked the company to give some away to Reg readers.
SANS - Survey on application security programs
In this whitepaper learn about the state of application security programs and practices of 488 surveyed respondents, and discover how mature and effective these programs are.