HPE supercomputer is still crunching numbers in space after 340 days
No rad hardening so even HPE is 'pleasantly surprised'
HPE’s mini supercomputer, launched into space last year, has survived the harsh conditions of zero gravity and radiation for almost a year.
The Spaceborne Computer is no record-breaker: it delivers about one teraflop of performance, runs Red Hat Enterprise Linux, and is built from two HPE Apollo Intel x86 servers linked by a 56Gbps interconnect.
NASA wanted to see if a computer could last a year inside the International Space Station (ISS), roughly the time it takes to reach Mars. So HPE offered to tuck its Spaceborne Computer aboard SpaceX’s CRS-12 resupply mission and send it into the abyss.
“It has now been in space for 340 days”, said Mark Fernandez, Americas HPC technology officer at HPE and co-principal investigator for the experiment, during a panel talk at the ISS Research & Development Conference on Wednesday in San Francisco.
The computer doesn’t help the astronauts with any daily tasks or run any fancy programs - you definitely cannot surf the internet. Instead, it runs a series of benchmarking tests that push its interconnect, storage, CPU and memory to the limit to see if anything breaks in space. The results are then compared with those from a computer on Earth to see if there are any differences.
The machine hasn’t been radiation hardened, and relies on a few software tricks to keep it from being corrupted, something Fernandez calls “autonomous self-care”. Continuous health checks keep an eye on the computer, and when they detect a potential hardware failure, it runs at a slower pace or enters “idle mode”, where it powers down.
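HPE has not published the logic behind this self-care, so the following is only a minimal sketch of the general idea: a health check feeds a decision that picks the fastest mode that still looks safe. The names `Mode` and `choose_mode`, the error-count input and the threshold of 10 are all hypothetical, chosen purely for illustration.

```python
from enum import Enum


class Mode(Enum):
    """Operating modes, ordered from most to least preferred."""
    FULL_SPEED = "full speed"
    THROTTLED = "throttled"
    IDLE = "idle"


def choose_mode(correctable_errors: int,
                hardware_fault_suspected: bool,
                throttle_threshold: int = 10) -> Mode:
    """Pick the fastest mode that still looks safe.

    All inputs are assumptions for this sketch: a count of recent
    correctable errors and a flag from some health check. The real
    system's signals and thresholds are not described in the article.
    """
    if hardware_fault_suspected:
        # Powering down is preferable to risking damage.
        return Mode.IDLE
    if correctable_errors >= throttle_threshold:
        # Running slowly is preferable to powering down.
        return Mode.THROTTLED
    return Mode.FULL_SPEED
```

Called in a loop after each health check, something like `choose_mode(25, False)` would ask for throttling, while a suspected hardware fault would send the machine to idle regardless of the error count.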
“Running fast is better than running slow. Running slow is better than it being powered off. But being powered off is better than being damaged,” Fernandez told The Register. There have only been two instances where the supercomputer went down, and both were random accidents.
One happened when the smoke alarm aboard the ISS was falsely triggered, which in turn cut the power supply as an emergency measure. “All sorts of things float around in the space station, dust, bits of toothpaste, crumbs, you name it, and maybe one of the particles got lodged in the smoke detector. We don’t know what caused it,” he said.
The other was down to an astronaut who accidentally turned off the power switch on the rack containing HPE’s computer while unloading supplies from SpaceX’s Dragon capsule.
“The most common failures are non-permanent, computational ones, like [those affecting] its power, memory, CPU cache. These happen more frequently in space than on Earth,” Fernandez said. “The interconnect is fine, but SSDs fail at an alarming rate in space.”
HPE is considering radiation-hardened SSDs if it builds another computer for a future space mission. “It’s a pleasant surprise that it’s still working since even we didn’t think it would last this long. At least we know it would probably make it all the way to Mars, but we’re not sure if it’d make it back,” he said. ®