
'We don't use UPS. If we did we'd have huge UPSs and tiny computers'

When the heat is on, the last thing a supercomputer needs is a big battery

The heatwave-driven outage at the VLSCI supercomputing facility last week could have been worse than it was, with power cuts also a risk, the facility has confirmed.

A senior systems administrator at VLSCI, Chris Samuel, has discussed the outage and the lessons learned with The Register.

While the reason for the shutdown was heat, Samuel said there were also concerns that the heatwave might lead to a power cut. Melburnians were warned last week that there might be cuts as the heatwave dragged on (and air-conditioners laboured to cope).

There were some cuts, but they didn't affect the VLSCI, which is a good thing, because there isn't a backup. As he told us, power cuts are always a concern: “we don't use UPS for the computer systems – we would end up with huge UPSs and tiny computer systems.”

“That said, we've always been very lucky with power around this area … it might be because of our proximity to [Melbourne] hospitals.”

As we wrote yesterday, the incoming water temperature ended up exceeding the specification for the facility. The cooling is a closed system (thanks to the commenter who also noticed this).

The VLSCI setup, Samuel explained, has one coolant loop from the roof into a buffer tank. From there, the water is fed to CDUs (coolant distribution units), which is where the heat from the machines is dumped. Inside the machine rooms, there are three closed loops: one each for two Blue Gene/Q racks, and a third for the water-cooled rear rack doors for the other machines.

The water is then circulated to the chillers on the roof, “and the cycle begins again”, he said. In the extreme heat, the roof temperatures meant that the chillers were delivering water that, Samuel said, “was getting close to the threshold for the racks, and was still climbing.”
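To make "close to the threshold, and still climbing" concrete, here is a minimal sketch of how an operator might project when inlet water would cross a rack limit. It is not VLSCI's monitoring code: the function name, the readings and the threshold are all hypothetical, and it assumes the recent trend is roughly linear.

```python
from typing import Optional, Sequence, Tuple


def minutes_until_threshold(
    samples: Sequence[Tuple[float, float]],
    threshold_c: float,
) -> Optional[float]:
    """Estimate minutes until inlet water reaches threshold_c.

    samples holds (minute, temperature_c) readings, oldest first.
    Returns 0.0 if the latest reading is already at or above the
    threshold, None if the trend is flat or falling, otherwise the
    projected minutes from the latest reading.
    All inputs are hypothetical; no real facility values are used.
    """
    if len(samples) < 2:
        raise ValueError("need at least two readings to fit a trend")

    latest_temp = samples[-1][1]
    if latest_temp >= threshold_c:
        return 0.0

    # Ordinary least-squares slope, in degrees C per minute.
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        raise ValueError("readings must span more than one point in time")
    slope = sum((t - mean_t) * (c - mean_c) for t, c in samples) / denom

    if slope <= 0:
        return None  # not climbing, so no projected crossing

    return (threshold_c - latest_temp) / slope
```

A projection like this only buys warning time. It says nothing about how quickly a large machine can be shut down cleanly once the limit is in sight.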

Avoca was the most affected system, simply because it's so much more powerful than the Merri or Barcoo machines: “Even though it's far more power efficient than the Intel systems, its combined heat generating capability is huge – it dumps far more heat into the water than both the Intel systems combined.” ®
