Weather gets granular with GPUs

Just say NOAA

HPC Blog Everyone complains about the weather, but no one is doing anything about it.

The folks at the National Oceanic and Atmospheric Administration (NOAA) aren't doing anything about the weather. They're too busy trying to figure out what it's going to do tomorrow and next week.

I sat in on a very interesting presentation from NOAA on Tuesday afternoon at the GPU Technology Conference about how they're going to use GPUs to sharpen up their forecasts and dial them in to a much finer resolution. This is quite a computational problem, as it turns out. In 2008, it took 800 cores to drive their model at a 15 to 30 km resolution. To get to a 10 km resolution, it took a bit more hardware – 125,200 more processor cores, to be exact – for a grand total of 126,000 cores.

Their next step is to get to 3.5 km resolution, which is an entirely different kettle of fish. The only way to get to this level of granularity is to move to GPUs in a big way - which is what they're pursuing right now. They've learned some lessons along the way, the foremost being that the key to taking advantage of GPUs efficiently is knowing your own code intimately.

For example, when they were running their models on CPUs exclusively, interprocess communications used up about 5 per cent of the cycles. The move to GPUs didn't change the need for these communications or the time they take, but because the GPUs get through the computation so much faster, communication's share grew to 50 per cent of total processing time - making these processes enemy number one.
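The jump from 5 per cent to 50 per cent is what simple arithmetic predicts if communication time stays fixed while compute time shrinks. Here is a minimal sketch in Python, assuming the run splits cleanly into just two parts, compute and communication. The helper names `comm_fraction` and `speedup_for_fraction` are made up for illustration and have nothing to do with NOAA's code.

```python
def comm_fraction(cpu_comm_share, compute_speedup):
    """Share of runtime spent communicating once compute is accelerated.

    Assumes communication time is unchanged and only the compute part
    gets faster by `compute_speedup` (a simplification for illustration).
    """
    comm = cpu_comm_share
    compute = (1.0 - cpu_comm_share) / compute_speedup
    return comm / (comm + compute)


def speedup_for_fraction(cpu_comm_share, target_share):
    """Compute speed-up at which communication reaches `target_share`."""
    # Solve comm / (comm + (1 - comm) / s) = target_share for s.
    comm = cpu_comm_share
    return (1.0 - comm) * target_share / (comm * (1.0 - target_share))


if __name__ == "__main__":
    s = speedup_for_fraction(0.05, 0.50)
    print(f"compute speed-up needed: {s:.1f}x")
    print(f"communication share at that speed-up: {comm_fraction(0.05, s):.0%}")
```

Under these assumptions, a 5 per cent communication share reaches 50 per cent once the compute portion runs about 19 times faster. That is in the same ballpark as the speed-ups quoted in the talk.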

Memory management is also hugely important. GPUs are incredibly fast on the right code, but not understanding how best to use the memory on the GPU card can keep you from getting the most out of them. There are two classes of memory on the cards: the 16 KB that is closest to each GPU core, and the much larger (1 GB in the NOAA situation) global memory shared across the whole card.

The difference in access speed is profound - it takes only two cycles to get to the close memory, and 100 cycles to get to the global memory. Accessing memory on the server host would, presumably, be measured in geologic time. Wise use of the blazingly fast, but tiny, memory attached to each core can make the difference between going faster and going a whole hell of a lot faster.
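To get a feel for how much the split between the two memories matters, here is a back-of-the-envelope sketch in Python. It uses the two latencies quoted in the talk and assumes, purely for illustration, that every access costs either 2 or 100 cycles with nothing in between. The function name `average_access_cycles` is hypothetical.

```python
FAST_CYCLES = 2      # latency of the small per-core memory, as quoted
GLOBAL_CYCLES = 100  # latency of the card-wide global memory, as quoted


def average_access_cycles(fast_share):
    """Mean cycles per access when `fast_share` of accesses hit fast memory."""
    if not 0.0 <= fast_share <= 1.0:
        raise ValueError("fast_share must be between 0 and 1")
    return fast_share * FAST_CYCLES + (1.0 - fast_share) * GLOBAL_CYCLES


if __name__ == "__main__":
    for share in (0.0, 0.5, 0.9, 1.0):
        cycles = average_access_cycles(share)
        print(f"{share:.0%} of accesses in fast memory -> {cycles:.1f} cycles on average")
```

Even in this crude model, the average cost stays dominated by global memory until nearly all accesses are served from the fast memory. That is why the layout of data on the card gets so much attention.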

Likewise, constantly fetching data from the CPU-based host server is costly from a performance standpoint. One weather model, called WRF (pronounced "Worf," like the Star Trek guy), showed a 20x speed-up in raw performance that shrank to 7x when taking into account the time needed to copy data from the server over and over again. The NOAA folks have restructured their programs to minimize data copying and have seen performance rise commensurately.
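The 20x and 7x figures also let you estimate how much of the GPU run went on copying. The Python sketch below assumes both speed-ups are measured against the same CPU-only baseline and that the whole gap between them is copy time. Neither assumption was stated in the talk, and the helper `copy_overhead` is hypothetical.

```python
def copy_overhead(raw_speedup, effective_speedup):
    """Return (copy_time, copy_share) with the CPU-only runtime normalised to 1.

    copy_time  - time spent copying, as a fraction of the CPU-only runtime
    copy_share - fraction of the GPU run's wall time spent copying
    """
    gpu_compute = 1.0 / raw_speedup        # GPU time ignoring copies
    gpu_total = 1.0 / effective_speedup    # GPU time including copies
    copy_time = gpu_total - gpu_compute
    return copy_time, copy_time / gpu_total


if __name__ == "__main__":
    copy_time, copy_share = copy_overhead(raw_speedup=20, effective_speedup=7)
    print(f"copy time: {copy_time:.3f} of the CPU-only runtime")
    print(f"copy share of the GPU run: {copy_share:.0%}")
```

On those assumptions, roughly 65 per cent of the GPU run's wall time was spent moving data rather than computing. That fits with restructuring the code to copy less paying off so well.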

Right now they're seeing speed-ups ranging from 15x to 39x with GPU + CPU systems vs. exclusively CPU-based hardware. This is with fully optimized CUDA code running on a smallish pilot system, but it has proven the validity of their approach and is a pretty big win. Their push going forward is to scale the model to larger hardware - fueled by GPUs - completing the transition to the 3.5 km resolution. ®
