Software disaster zone Knight Capital bags $400m lifeline
What really happened when computers at trading firm went nuts
I recently wrote about how a bad round of software testing lost Wall Street trading firm Knight Capital an estimated $440m – enough to almost put the company out of business.
I speculated that Knight could be bailed out if it's allowed to unwind its computer system's unexpected burst of loss-making trades on the stock market – effectively taking a mulligan* on the 45-minute debacle. Turns out that ain't gonna happen.
Knight was left squirming on the hook by US regulators and subsequently forced to find money to cover the losses from third parties. In return for floating Knight $400m, six other Wall Street firms will be paid a 2 per cent preferred stock dividend and, if they like, be able to convert those preferred shares into enough common stock to own 75 per cent of the company – a pretty sweet deal for the rescuers, given that Knight was a solid market player until last Wednesday.
Tyler Durden’s blog posts at Zerohedge have been both fast and solid on this story. Here he posts some of the highlights from an interview with Knight CEO Tom Joyce. One of the quotes from Joyce: “We have to do a better job on our testing environment.”
Yeah, I think I'd make that a priority, or at least move it further up the list. It's an understatement of such magnitude that I'm at a loss to come up with an apt comparison. Maybe if Napoleon had said, "We need to do a better job of scouting out our enemies" after Waterloo? But I'm drawing a blank right now.
So what happened IT-wise with Knight Capital? Zerohedge links to a Nanex Research blog that looks to have a pretty good handle on the gory details. They believe that Knight was testing to make sure that a new market maker software package (Retail Liquidity Provider – RLP) would integrate with the NYSE live trading system.
(Being a ‘market maker’ simply means that your firm holds a position in a particular stock, and that you’re usually a willing buyer or seller. We can talk about ‘specialists’ vs market makers, but that’s not important in this context.)
In addition to the RLP code, there’s a testing routine that fires off buy and sell orders at RLP in order to ensure that it properly records all of the trades. It’s like a load generator for a commercial application, and it’s used in an isolated lab to simulate live trading.
It looks like that testing routine was mistakenly included in the RLP deployment package, and the whole thing was fired up on Wednesday morning and linked to the NYSE live system. As Nanex eloquently described it:
“...On the morning of August 1st, the Tester is ready to do its job: test market making software. Except this time it's no longer in the lab, it's running on NYSE's live system. And it's about to test any market making software running, not just Knight's. With real orders and real dollars. And it won't tell anyone about it, because that's not its function.”
And the tester continued testing and testing and testing. It bought and sold willy-nilly and kept quiet about it. The Knight IT and trading staff probably didn’t even know that the testing program was running. And since it’s just a testing program, it didn’t keep track of any of its activity – meaning it wasn’t easy for Knight to immediately understand the magnitude of what had happened and the massive losses they’d incurred.
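To make the failure mode concrete, here's a minimal sketch – purely hypothetical, not Knight's actual code – of the kind of order-firing harness Nanex describes. The function name, order fields, and `send_order` callback are all assumptions for illustration. The point is that nothing in the loop itself knows or cares whether the gateway it's pointed at is a lab simulator or a live exchange, and it keeps no record of what it sends:

```python
# Hypothetical sketch of a market-maker test harness (NOT Knight's code).
# The danger: the harness fires orders at whatever gateway it's wired to,
# with no environment check and no logging of its own activity.
import itertools
import random

def fire_orders(send_order, symbols, n_orders):
    """Blast alternating buy/sell orders at a gateway via send_order().

    The harness neither logs what it sends nor checks where the orders
    are going – reporting isn't its function, exactly as Nanex notes.
    """
    sides = itertools.cycle(["BUY", "SELL"])
    for _ in range(n_orders):
        order = {
            "symbol": random.choice(symbols),
            "side": next(sides),
            "qty": random.choice([100, 200, 500]),
        }
        send_order(order)  # lab simulator or live NYSE – it can't tell

# In the lab, send_order points at a simulator that just collects orders:
sent = []
fire_orders(sent.append, ["IWM", "SPY"], n_orders=6)
print(len(sent))
```

Swap the simulator callback for a live exchange connection – which is effectively what the bad deployment did – and the exact same loop spends real dollars, silently.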
Take a look at the Nanex account of this: it’s short, easily understandable, but damned chilling. While you’re reading it, keep in mind that Knight Capital isn’t a high frequency trading firm or a hedge fund. Their bread and butter is aggregating and executing securities orders for retail brokerages like TD Ameritrade and Scottrade. Knight Capital wasn’t operating at the cutting edge of financial engineering or gimmickry; they’re middlemen who get a very tiny slice of each transaction because they can trade faster and less expensively than their customers can on their own.
Knight Capital wasn't undone by taking on wild market risk or by being reckless. They were taken down by something entirely different: bad IT practices. There probably isn't any single party to blame. It's most likely a combination of bad documentation and poor attention to detail from both Knight and third parties (ISV, integrator, whoever). These factors, and probably some others, combined to make for a bad, bad morning. ®
* A free shot given to a golfer in an informal game when his or her previous shot was poorly played. What do you mean you don't play golf?