Merchants remain angry over Protx outage
Root cause analysis
Analysis The fallout from the outage of UK payment processing firm Protx is continuing.
Thousands of online merchants were unable to take payments last Wednesday as a result of a mishandled upgrade that left payment systems at Protx unavailable. Several Reg readers also reported they were unable to reach the firm by either phone or email when trying to resolve the problem.
The delayed upgrade - originally due to be applied in June - involved changes in the processing of Maestro cards, a new system for handling delayed settlement of transactions, and other changes. The most significant (and troublesome, as it turned out) of these changes involved the introduction of 3D Secure, which is similar to an online version of "Chip and PIN". 3D Secure is being introduced by the banks to help reduce fraud for ecommerce transactions. There's some debate over whether Protx applied this system before it strictly had to.
Protx, a Sage subsidiary that's one of the largest payment services in the UK, processes credit and debit card payments for more than 10,000 online and mail order businesses. Problems with the system affected a great number of (mostly small) ecommerce operations.
Root cause analysis
In a letter to merchants sent out on Thursday, Protx chief exec Michael Alculumbre apologised for the inconvenience they experienced as a result of the outage of the Protx payment gateway. Soon after the upgrade - the biggest Protx has implemented since November 2005 - was applied at 0600 on Wednesday, a bottleneck developed in the system. This led to a queuing of transactions that eventually overloaded the payment gateway. Protx said the issue had not cropped up during testing.
The root cause of the problem was eventually identified at a low level within Protx's database. "It was established that one table in the database had not been indexed properly and that this was the primary cause for the failure," the letter explains.
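Protx has not published its schema, so the effect of a missing index can only be illustrated in general terms. The sketch below uses Python's built-in sqlite3 module; the `transactions` table, its columns and the index name are all hypothetical and are not taken from Protx's systems. It asks the database for its query plan before and after an index is added to the column being looked up.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Hypothetical schema for illustration only, not Protx's real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, vendor_tx_code TEXT, amount_pence INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions (vendor_tx_code, amount_pence) VALUES (?, ?)",
    [(f"TX{i}", 100 + i) for i in range(10_000)],
)

lookup = "SELECT amount_pence FROM transactions WHERE vendor_tx_code = ?"

# Without an index, the lookup has to scan every row in the table.
plan_before = query_plan(conn, lookup, ("TX9999",))

# With an index on the lookup column, the database can seek straight to the row.
conn.execute(
    "CREATE INDEX idx_transactions_vendor_tx_code "
    "ON transactions (vendor_tx_code)"
)
plan_after = query_plan(conn, lookup, ("TX9999",))

print("before:", plan_before)
print("after: ", plan_after)
```

On a small test table a full scan is barely noticeable, which is one plausible way such a problem could slip through testing and only surface as a bottleneck under production volumes.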
"Once we had identified and rectified the problem we were able to restore full service at around 2.30pm... It was my decision as CEO to resolve the system outage on the new platform and not to roll back to the old Protx payment platform."
Although Protx said its service was back online by 1430 BST on Wednesday, a number of users are continuing to experience problems, according to emails from Reg readers and comments on Protx's forums. "I have been getting spurious transaction errors at all times of the day during and after this crisis was supposedly resolved from all kinds of cards," said Reg reader Paul.
Redirects from old URLs also caused problems on Wednesday but it was the timeout issues that caused the greatest headaches. Several readers reported problems with timeout errors - a problem that could lead to customers being billed twice for transactions, as Register reader Gareth explains.
"The main issue, from my perspective was that transactions were timing out. I feel that Protx do not see this as a failure. They are still processing transactions flat out, but their queues just backup and most vendors receive no response whatsoever. I think this is why they say their system was backup at 2:30pm. The problem for many is that after a minute or two they retry the TX [transaction] and the customer gets charged twice," he said.
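One common defence against the double billing Gareth describes is to make the charge request idempotent, so that a retry carrying the same identifier returns the original result instead of charging again. The sketch below is a toy in-memory model of that idea; the `PaymentGateway` class, its `charge` method and the key format are hypothetical and do not represent Protx's API or how its gateway behaves.

```python
class PaymentGateway:
    """Toy in-memory gateway used only to illustrate idempotent charging."""

    def __init__(self):
        self._results = {}  # idempotency key -> result already returned
        self.charges = []   # every charge actually applied to a card

    def charge(self, idempotency_key, card, amount_pence):
        # A retry with a key we have already seen gets the stored result
        # back, and the card is not charged a second time.
        if idempotency_key in self._results:
            return self._results[idempotency_key]

        self.charges.append((card, amount_pence))
        result = {"status": "OK", "charge_number": len(self.charges)}
        self._results[idempotency_key] = result
        return result


gateway = PaymentGateway()

# First attempt: assume the charge succeeds at the gateway, but the
# response is lost to a timeout before it reaches the merchant.
first = gateway.charge("order-1001", "card-A", 2500)

# The merchant retries a minute later, reusing the same key for the order.
retry = gateway.charge("order-1001", "card-A", 2500)

print(len(gateway.charges), retry["status"])
```

The design relies on the merchant generating one key per order and reusing it on every retry; a fresh key per attempt would defeat the check.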