'We don't use UPS. If we did we'd have huge UPSs and tiny computers'

When the heat is on, the last thing a supercomputer needs is a big battery

The heatwave-driven outage at the VLSCI supercomputing facility last week could have been worse than it was, with power cuts also a risk, the facility has confirmed.

A senior systems administrator at VLSCI, Chris Samuel, has discussed the outage and the lessons learned with The Register.

While the reason for the shutdown was heat, Samuel said there were also concerns that the heatwave might lead to a power cut. Melburnians were warned last week that there might be cuts as the heatwave dragged on (and air-conditioners laboured to cope).

There were some cuts, but they didn't affect the VLSCI, which is a good thing, because there isn't a backup. As Samuel told us, power cuts are always a concern: “we don't use UPS for the computer systems – we would end up with huge UPSs and tiny computer systems.”

“That said, we've always been very lucky with power around this area … it might be because of our proximity to [Melbourne] hospitals.”

As we wrote yesterday, the incoming water temperature ended up exceeding the specification for the facility. The cooling is a closed system (thanks to the commenter who also noticed this).

The VLSCI setup, Samuel explained, has one coolant loop from the roof into a buffer tank. From there, the water is fed to CDUs – coolant distribution units – where the heat from the machines is dumped. Inside the machine rooms, there are three closed loops: one each for two Blue Gene/Q racks, and a third for the water-cooled rear rack doors for the other machines.

The water is then circulated to the chillers on the roof, “and the cycle begins again”, he said. In the extreme heat, the roof temperatures meant that the chillers were delivering water that, Samuel explained, “was getting close to the threshold for the racks, and was still climbing.”

Avoca was the most affected system, simply because it's so much more powerful than the Merri or Barcoo machines: “Even though it's far more power efficient than the Intel systems, its combined heat generating capability is huge – it dumps far more heat into the water than both the Intel systems combined.” ®
