Wow, what an incredible 12 months: 2017's data center year in review
Predictions of the present past from today's future, or something
OpenPOWER will emerge as the viable alternative to x86
The real battle in server architecture was between Intel’s in-house coalition and what has come to be known as the Rebel Alliance: IBM’s OpenPower industry coalition. Intel brought its all-star team: Xeon Phi, Altera, Omni-Path (plus Nervana/Movidius) and Lustre, while OpenPower countered with a dream team of its own: POWER, Nvidia, Xilinx, and Mellanox (plus TrueNorth) and GPFS (Spectrum Scale).
The all-in-house model promised seamless integration and consistent design, while the extended team offered a best-of-breed approach. Both had, and still have, merits. Both camps are pretty formidable. And there is real differentiation in strategy, design, and implementation. Competition is good.
My car will break down on 101
It has to be mentioned that a couple of times in 2017, my fancy schmancy car broke down on the freeway for no apparent reason. Turned out that my auto entertainment system launched a denial of service attack on the rest of the car, in a bold attempt to gain control. It managed to get the nav system on its side, which made things touch and go for a while. However, a hard reboot at the dealer managed to fix things. But I’m much more wary now...
More chips than Vegas; riskier too
For the first time in decades, there has been a real opening for new chips. What drove this included:
- The emergence of new apps, led by AI and IoT. The part of AI that is computationally interesting and different is high-performance AI, since it intersects with HPC. On the IoT side, the backend apps are typically analytics that make sense of sensor data. These new apps went where they could run better/faster/cheaper. Even now, they are still too new to be burdened by any allegiance or bonds to a particular chip.
- The fact that many existing apps have no clue what hardware they run on, and operate on the upper layers of a tall stack.
- The possibility to build a complete software stack from open source software components.
- The presence of very large customers like cloud providers or top supercomputing sites. They buy in seriously large volumes and have the wherewithal to build the necessary software stack, so they can afford to pick a new chip and bolster, if not guarantee, its viability.
This was a year when many new chips became available and were tested, and there was a pretty long list of them, showing just how big the opportunity is, how eager investors must have been to not miss out, and how many different angles there were.
In addition to AI, there were a few important general-purpose chips being built. The coolest one was by startup Rex Computing, which is working on a chip for exascale, focused on very high compute-power/electric-power ratios. Qualcomm and Cavium showed off manycore ARM processors, and Intel pushed the x86 envelope nicely.
With AI chips, Intel already had acquired Nervana and Movidius. Google had its TPU, and IBM had its neuromorphic chip, TrueNorth. Other AI chip efforts included Mobileye, Graphcore, BrainChip, TeraDeep, KnuEdge, Wave Computing, and Horizon Robotics. In addition, there were several well-publicized and respected projects like NeuRAM3, P-Neuro, SpiNNaker, Eyeriss, and krtkl going after different parts of the market. That’s a lot of chips, but most of these bets, of course, won’t pay off.
ARM server market share will stay below 3 per cent
Speaking of chips, ARM servers remained important but elusive. They made a lot of noise and pointed to significant wins and systems, but failed to move the needle when it came to revenue market share in 2017.
As a long-term play, ARM is an important phenomenon in the server market – more so now with the backing of SoftBank, a much larger company apparently intent on investing and building, and various supercomputing projects that are great proving grounds.
But in the end, you need differentiation, and ARM has the same problem against x86 as Intel’s Atom had against ARM. Atom did not differentiate vs ARM any more than ARM is differentiating vs Xeon.
Nevertheless, most systems end up being good at something. There are new apps, plus an open-source stack to support existing apps, and that helped find a couple of specific workloads where ARM implementations could shine.
Is it an app, or is it a fabric? More cloud fabrics introduced
What is going on with big new apps? They keep getting more modular (microservices), more elastic (scale out), and more real-time (streaming). They’ve become their own large graph, in the computer science sense, and even more so with IoT (sensors everywhere plus in-situ processing).
When you have so many interacting pieces, you’ve got a fabric. But as an app, it’s the kind of graph that evolves and has an overall purpose (semantic web). Among engineering disciplines, software engineering already doesn’t have a great reputation for predictability and maintainability. More modularity is not going to help with that.
But efforts to manage interdependence and application evolution have already created standards for structured modularity, such as the OSGi Alliance's specifications for Java. Smart organizations have had ways to reduce future problems (technical debt) from the get-go. So, it was nice to see that type of methodology get better recognized and better generalized.
So how was your 2017? ®