The Register® — Biting the hand that feeds IT

AMD Opteron CPUs hit by heat stroke

'Marginal' effect, chip maker claims

Exclusive AMD today admitted that a number of 2.6GHz and 2.8GHz single-core Opteron x52 and x54 processors which could corrupt data under extreme conditions have inadvertently escaped into the wild.

It is believed that the glitch is triggered when the affected chip's FPU is made to loop through a series of memory-fetch, multiplication and addition operations without any condition checks on the result of the calculations. The loop has to run for long enough to cause localised heating which, combined with high ambient temperatures, could cause the result of the operation to be recorded incorrectly, leading to data corruption.

To trigger the effect, the loop has to be run millions of times, an AMD customer source told Reg Hardware, potentially for hours at a time with no other operations being introduced during the run.
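
For readers who want a picture of the kind of loop being described, here is a minimal sketch of its shape. It is not AMD's synthetic benchmark, which has not been published, and it will not reproduce the fault. The function names `mac_loop` and `mac_loop_checked` and the `limit` parameter are invented for illustration. Python's interpreter also does plenty of its own branching under the hood, so this shows only the arithmetic pattern (fetch, multiply, add, repeat) and the difference a compare on the result makes.

```python
def mac_loop(a, b, passes):
    """Repeated multiply-accumulate with no test on the running result.

    Hypothetical illustration only: each step fetches two values,
    multiplies them and adds the product to the accumulator.
    """
    acc = 0.0
    for _ in range(passes):
        for x, y in zip(a, b):
            acc += x * y  # fetch, multiply, add; acc is never compared
    return acc


def mac_loop_checked(a, b, passes, limit=1e300):
    """Same arithmetic, but the running sum is compared on every step.

    The 'limit' threshold is an assumption made up for this sketch.
    """
    acc = 0.0
    for _ in range(passes):
        for x, y in zip(a, b):
            acc += x * y
            if acc > limit:  # a compare operation on the result
                return acc
    return acc


if __name__ == "__main__":
    a = [1.0, 2.0]
    b = [3.0, 4.0]
    print(mac_loop(a, b, passes=3))
    print(mac_loop_checked(a, b, passes=3))
```

The first function corresponds to the unchecked pattern described above, run for a vastly larger number of passes. The second adds the sort of condition check on the result that the first one lacks.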

According to the source - who claimed to be party to emails highlighting the issue and sent by AMD to a number of the chip maker's major customers and partners - AMD has investigated the problem and found it was only able to reproduce the bug's effects in a synthetic benchmark test.

The problem is believed to affect only a fraction - perhaps no more than 3,000 individual CPUs - which managed to slip through AMD's screening net. It is not known how this so-called 'test escape' occurred, but it took place "in part of 2005 and early 2006", an AMD spokesman said.

AMD said it has introduced another screening test to catch any further affected parts. Chips caught by this test in future will be re-rated at a lower clock speed to prevent the problem. The company is also working with OEMs to identify affected parts and contact customers who could be affected. Those who are will be offered free replacements.

AMD stressed the problem was due to "a convergence of three specific simultaneous conditions", not a fault with the Opteron architecture. The company claimed the issue had not been observed on systems running commercially available applications.

"It's very hard to imagine this type of [tight FP loop] code in our [financial services] environment," Reg Hardware's source said. "The only thing I could think that would be coded this way would be some type of strange cipher code. For example, any type of 'for' loop that uses a compare operation would not have the problem." ®
