A 16 Petaflop Cray: The key to fantastic summer barbecues
UK’s Met Office embarks on vital save-our-burgers compute crusade
Successful BBQ at the weekend, or did a cloudburst stop play, leaving your fridge groaning beneath a mountain of uncooked ribs and sausages?
If only the weather forecast had been more reliable.
By 2017 that might be possible, as the Met Office – which generates more than 3,000 tailored forecasts and briefings each day and serves, among others, the BBC – is in the middle of the first phase of rolling out a £100m supercomputer it hopes will drastically improve the research that its predictions depend on.
In March, the Met Office installed the first of three Cray XC40s that, combined, are billed as the largest supercomputer dedicated to a single purpose: weather.
The completed Linux super-cluster will expand the Met’s current horsepower by a factor of nine – to 16 Petaflops across 480,000 cores.
The first XC40 is a 1.2 Petaflop machine going into one of two existing computer halls – home to an outdated IBM P775. A second XC40 goes in next spring, bringing online an additional 4.8 Petaflops, but the show-stopper will be a 12,000-node machine that gets its own home: a custom-designed hall at the Met Office’s base in Exeter.
It's so new the Met can't, or won't, give final details on the computer hall's power or cooling needs, or on the performance characteristics of its software on the machine.
Why? The Met Office wants its biggest XC40 to run on a generation of chips from supplier Intel that has yet to actually ship: the delayed Skylake, which was kicked out of 2014 into the first quarter of this year and which Intel watchers now hope and expect to see in August, at the firm's Developer Forum in San Francisco, California.
Cray landed the Met Office deal in September 2014, following a public tender.
Dave Underwood, deputy director of the Met Office high-performance computing (HPC) programme and this super's design and execution chief, told The Reg why he's holding out for a processor that's yet to be delivered.
“Skylake would mean higher processor rates and that means more flops per watt – to an organisation that’s compute-intensive, the more flops we get per watt the better,” Underwood said in an interview.
Underwood described the challenge of planning for a system whose hardware heart has yet to be made publicly available as “interesting”. There's no final word on how the CPUs will perform in the real world on the Met's software, nor on their absolute requirements in terms of power and cooling.
Weather HQ: The Met Office in Exeter – more than 3,000 forecasts and predictions each day
“We have ballpark estimates, but until we see what the manufacturer has done, we don’t know what the results will be,” Underwood said.
But why exactly is Underwood holding out for those petaflops?
The Met’s super-cluster will be dedicated to crunching weather models built using millions of lines of Fortran code, written by the Met Office's combined teams of scientists and engineers.
And not just running one model at a time, like now, but running lots of models simultaneously and in a shorter time – 40 minutes per model, compared with an hour today.
The reason is that there's a growing demand for the type of weather prediction based on ensemble modeling that's already in use. It is replacing deterministic prediction, which goes back to the 1950s.
Ensemble modeling builds forecasts that deliver a range of probable outcomes using a variety of different data and visual inputs. The bedrock is Monte Carlo simulations: computational algorithms whose results can be expressed in terms of likelihood – the probability that a given event will take place.
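For readers who want to see the idea in miniature, here is a minimal sketch of a Monte Carlo ensemble in Python. It is not the Met Office's method or code: the "model" is the logistic map, a textbook chaotic toy standing in for a real weather model, and every name and parameter here (toy_model, ensemble_probability, best_guess, spread, threshold) is a hypothetical chosen for illustration. The sketch perturbs the starting state many times, runs the model once per perturbation, and reports the fraction of runs in which the "event" occurs.

```python
import random


def toy_model(x0, steps=50, r=3.9):
    """Iterate the logistic map, a toy chaotic system (NOT a weather model)."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


def ensemble_probability(best_guess, spread, threshold, members=1000, seed=0):
    """Estimate the probability that the final state exceeds `threshold`.

    Each ensemble member starts from the best-guess initial state plus a
    small random perturbation, mimicking uncertainty in the observations.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    hits = 0
    for _ in range(members):
        x0 = best_guess + rng.gauss(0.0, spread)
        # Clamp into (0, 1), the range on which the logistic map is defined.
        x0 = min(max(x0, 1e-6), 1.0 - 1e-6)
        if toy_model(x0) > threshold:
            hits += 1
    return hits / members


if __name__ == "__main__":
    p = ensemble_probability(best_guess=0.5, spread=0.01, threshold=0.5)
    print(f"Estimated probability of the event: {p:.2f}")
```

A single run of toy_model gives one yes-or-no answer. The ensemble gives a probability, and each member is an independent model run, which is why this style of forecasting soaks up so many cores.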
Not just that, but the plan is to offer more long-range predictions, further out than five days ahead, while getting better at predicting subtler and more frustrating weather elements, like fog.