Future supers pop up on $636m cash wishlist to get exascale beasts prowling on US soil
Oak Ridge, Lawrence Livermore labs slated for beastmode kit in 2021-2023
Posted in HPC, 5th March 2018 19:59 GMT
The two new mystery exascale computing systems known only as Frontier and El Capitan popped up on a budget request last week. They are being developed by the US government and have been slated for deployment in 2022 and 2023.
If American Congresspeople were to say yes to the budget, the DoE would get $636m towards current R&D work on exascale computing, including a $23m cushion to prep the Lawrence Livermore National Laboratory for the coming of El Capitan in five years' time.
However, as our sister publication The Next Platform has noted, scientists shouldn't start mentally spending the cash:
The proposal is not law or policy, and over the past decade, Congress has tended to essentially ignore the budget proposals from presidents and create its own spending plans. And in a highly divided and increasingly partisan Congress, even doing that has been difficult, with government funding often being done through weeks- or months-long emergency continuing resolutions that tend to keep budgets at previous levels.
The administration's FY2019 budget request last week included $636m in funding for the Department of Energy's Exascale Computing Project, $376m up on FY2017 enacted levels.
There are three exascale systems mentioned:
- Aurora – Intel/Cray-based, already known about, and due for delivery to Argonne National Laboratory (ANL) in 2021
- Frontier – for 2021-2022 delivery to Oak Ridge National Laboratory (ORNL)
- El Capitan – to be delivered to the Lawrence Livermore National Laboratory (LLNL) around 2023, with funding under the National Nuclear Security Administration's (NNSA) Advanced Simulation & Computing Capital
Some Exascale Computing Project (ECP) presentations shed further light on the matter.
The 180 petaFLOPS Aurora was supposed to go live this year but has been delayed since Intel stopped developing the Knights Hill gen 3 Phi processors on which it depended. The US government pushed back the deadline to 2021, by which time Chipzilla must rework its processors to bring Aurora up to 1,000 petaFLOPS.
Frontier and El Capitan are mentioned in an ECP Update slide deck (PDF).
We understand El Capitan is in an initial development phase. It and Frontier have no defined architecture or suppliers yet. We might suppose that, since Intel and Cray are working on Aurora, combinations of the other four hardware suppliers to ECP – AMD, HPE, IBM and Nvidia – might be involved.
The DoE has two departments with significant supercomputing spending – the Office of Science and the NNSA.
The budget request splits a $578m pool between further research funding for exascale and quantum computing – the latter scoops $105m. The quantum cash is earmarked to “address the emerging urgency of building U.S. competency and competitiveness in the developing area of quantum information science, including quantum computing and quantum sensor technology.”
The DoE will also develop a software stack for both exascale platforms and support additional co-design centres in preparation for exascale deployment in 2021.
According to a DoE presentation (PDF), an exascale system – a computer that can hit at least one exaFLOPS, or a billion billion floating-point math calculations per second – will deliver 50x the performance of today's 20 petaFLOPS systems, operate in a 20-30MW power envelope, and have a perceived fault rate of one per week or less.
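For readers who like to check the sums, here is a back-of-the-envelope sketch in Python of what those targets imply. The 50x, 20 petaFLOPS and 20-30MW figures come from the presentation as quoted above; the GFLOPS-per-watt numbers are our own derived arithmetic rather than a stated DoE target, and the helper name gflops_per_watt is purely illustrative.

```python
# Figures quoted from the DoE presentation; everything else is derived arithmetic.
EXAFLOPS = 10**18    # "a billion billion" floating-point operations per second
PETAFLOPS = 10**15

today_flops = 20 * PETAFLOPS     # "today's 20 petaFLOPS systems"
speedup = 50                     # "50x the performance"
power_envelope_mw = (20, 30)     # "20-30MW power envelope"

# 50 x 20 petaFLOPS is exactly one exaFLOPS.
target_flops = speedup * today_flops


def gflops_per_watt(flops, megawatts):
    """Energy efficiency implied by a FLOPS rate and a power draw in megawatts.

    Hypothetical helper for illustration; not taken from any DoE material.
    """
    watts = megawatts * 1_000_000
    return flops / watts / 1e9


# Implied efficiency at each end of the power envelope (our assumption:
# the machine sustains the full exaFLOPS while drawing the stated power).
efficiency = {mw: gflops_per_watt(target_flops, mw) for mw in power_envelope_mw}

for mw, eff in efficiency.items():
    print(f"{mw} MW -> {eff:.1f} GFLOPS per watt")
```

On those assumptions, the machine would need to manage somewhere between roughly 33 and 50 GFLOPS per watt, depending on which end of the power envelope it lands on.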
There is no LINPACK or peak FLOPS target. Instead, Figures of Merit (FOMs) will be defined.
Intel has staggered the ECP's schedule with its Knights Hill gen 3 cancellation. There is now no clear understanding of the hardware elements and architecture for the Aurora, Frontier and El Capitan exascale triplets. There are just three years until Aurora sets down at Argonne. Care to bet it will be on time? ®