Sun suffers UltraSparc II cache crash headache

So do users

Sun Microsystems is advising support staff not to let on to clients that problems they have with its kit might be due to a wider year-old technical problem.

The surprising advice from the hardware giant covers problems involving a processor fault that can cause certain Sun servers, particularly those with 400MHz UltraSparc IIs, to crash without warning. Servers with 450MHz UltraSparc II processors are also affected, but to a lesser degree.

Our sources within the hardware giant tell us that staff are working under "orders" not to tell customers that any failures they experience could be part of a wider problem, involving cache memory on its UltraSparc II processor. The fault results in random parity errors which can force a server to shut down.

"Apparently the design [Sun's] is fine, but the execution [which was outsourced] leaves a little to be desired. Result, system crashes [or in Sun lingo system panic]. In the best case, system panics re-starts and you never see the problem again. Worst case boot-loop," our informant tells us.

"It has gotten to the point that just about the first thing we ask [users] is 'what speed processor do you have', and one system panic isn't enough for us to do something about it."

The problem came to light over a year ago and was widespread enough for respected analyst firm Gartner to advise users to steer clear of the 400MHz, 4MB cache UltraSparc II microprocessor modules that are the focus of concerns. Instead it advised users to pick 400MHz, 8MB cache UltraSparc II microprocessor modules.

At the time Sun admitted there had been quality issues with Static RAM (SRAM) on some 400MHz CPUs, and quality control problems with the fibre-optic controllers. Sun said the problem was due to components supplied by a third party, and that it had changed its supplier.

Sun's line since then has been that few of its customers were affected by the issue and in any case the problem has now been solved.

However, a Sun best practice guide on "Addressing: E-Cache Parity Errors", published in October 2000 and leaked to The Register, suggests the problem is not as far in the past as the company would like to say.

This states: "Some customers have experienced intermittent external cache parity errors which can be caused by a faulty component (SRAM) that is overly susceptible to a number of factors. These factors can include temperature, humidity, slot, process running, noise and ionizing radiation that occurs naturally in the environment."

Throughout last year, Gartner reported that 60 clients had experienced problems with the bug on many of their Solaris servers. It reported that the UE10000 with more than 36 CPUs and the UE6500 with more than 20 CPUs seemed to be particularly susceptible to the problem.

The UltraSparc III processor features a mirrored cache and is immune to the problem, although high-end servers using the chip are not expected to ship until the second half of 2001, at the earliest. ®

