
China's Nebulae supercomputer - zero to second in 3 months

The perils of big-system assembly


HPC Blog

Those of you interested in what it really takes to bring up a massive system won't want to miss the "Lessons Learned Deploying the World's First GPU-Based Petaflop System" session.

In it, NVIDIA's senior hardware architect, Dale Southard, discusses his experience with China's Nebulae supercomputer - which, in addition to being the #2 system on the TOP500, was also probably the quickest big build of all time, at around three months total.

Southard, who describes himself as a professional debugger, has quite a track record with big systems. Before NVIDIA, he was at Lawrence Livermore, where he participated in a number of large builds (and probably has the scars to prove it). While the Nebulae super went from bare floor to petaflop processing in record time (about 90 days), it had its share of birthing pains.

In his session, Southard spent some time explaining the differences between small, medium and massive systems. According to him, the 'interesting times' begin in earnest when you move from thousand-node systems to something bigger.

The thousand-node system isn't trivial; it requires considerable - often custom - tooling for management and configuration tasks. But the bigger systems are a whole new world of complexity, mainly because faults that rarely (if ever) show up on smaller systems occur frequently once you amass such a huge array of gear.
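
A rough way to see why scale changes the picture: if each component fails independently with the same small probability, the chance that at least one of them fails grows quickly with the component count. The sketch below is a simplification for illustration only; the independence assumption, the function name and the 0.1% per-component figure are hypothetical and do not come from Southard's talk.

```python
def p_any_failure(p_single: float, n_components: int) -> float:
    """Probability that at least one of n independent components fails,
    given each fails with probability p_single (assumed identical)."""
    return 1.0 - (1.0 - p_single) ** n_components


if __name__ == "__main__":
    # Hypothetical per-component failure probability over some fixed period.
    p = 0.001
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} components: {p_any_failure(p, n):.4f}")
```

With these made-up numbers, a fault is an occasional event at 100 components (roughly a 10% chance) and close to a certainty at 10,000.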

Southard related a story about a capacitor that blew up like a mini hand grenade, damaging other components as bits of it wormed their way into nooks and crannies. That's not something you see every day.

He also shared a laundry list of things to check out proactively before you begin big-system assembly. For example: make sure that all the systems have a common BIOS level and that they have the correct processors running at the right speed. When you're talking about such a large number of systems, even stringent quality control can let one or two inconsistent builds slip through. Catching these problems early will save countless hours of troubleshooting down the road.
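
The consistency checks described above could be sketched roughly as follows: gather a few facts per node, take the most common value of each as the expected one, and flag any node that differs. This is a minimal sketch, not the tooling used on Nebulae; the field names, hostnames and values are hypothetical, and how the inventory is collected from the nodes is left out.

```python
from collections import Counter


def find_outliers(nodes, fields=("bios_version", "cpu_model", "cpu_mhz")):
    """Return {hostname: {field: (found, expected)}} for every node whose
    value differs from the most common value of that field in the fleet."""
    outliers = {}
    if not nodes:
        return outliers
    for field in fields:
        counts = Counter(node[field] for node in nodes)
        # Treat the majority value as the intended configuration.
        expected, _ = counts.most_common(1)[0]
        for node in nodes:
            if node[field] != expected:
                outliers.setdefault(node["hostname"], {})[field] = (
                    node[field],
                    expected,
                )
    return outliers


# Hypothetical inventory; in practice this would be gathered from each node.
inventory = [
    {"hostname": "node001", "bios_version": "1.2", "cpu_model": "cpu-a", "cpu_mhz": 2600},
    {"hostname": "node002", "bios_version": "1.2", "cpu_model": "cpu-a", "cpu_mhz": 2600},
    {"hostname": "node003", "bios_version": "1.1", "cpu_model": "cpu-a", "cpu_mhz": 2600},
    {"hostname": "node004", "bios_version": "1.2", "cpu_model": "cpu-a", "cpu_mhz": 2400},
]

if __name__ == "__main__":
    for host, diffs in sorted(find_outliers(inventory).items()):
        for field, (found, expected) in diffs.items():
            print(f"{host}: {field} is {found!r}, expected {expected!r}")
```

Using the majority value as the baseline avoids hard-coding an expected BIOS level, but it assumes most nodes are correct; a fixed reference configuration would be the safer choice if that is in doubt.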

The videos of the sessions aren't up yet, but I'm told that they should be posted by the end of the week. I'll put in links as soon as I get them... ®
