Xeon-bashing Tachyum claims its Prodigy CPU will run AI jobs as well as traditional apps
No need for GPU/TPU acceleration? We'll see
Tachyum is developing a processor that it alleges will run everyday applications as well as AI code that would normally require a GPU-like hardware accelerator.
AnandTech live-blogged the biz's Prodigy CPU presentation at this week's Hot Chips conference in California, where a lot of promises were made. We don't yet know if they'll be kept.
This homegrown chip is still being designed, and is alleged to be faster than Intel's Skylake CPUs by using 7nm FinFETs and other techniques. The CPU has up to 64 cores and eight memory channels. It will come in one die with three versions and four variants:
- T864 – 64 cores; 8 x DDR5/4 controllers; 72 x PCIe 5.0; 2 x 400/200/100 GbitE
- T432 – 32 cores; 4 x DDR4; 32 x PCIe 4.0; 2 x 100/50/10 GbitE
- T216 – 16 cores; 2 x DDR4; 32 x PCIe 4.0; 2 x 50/10 GbitE
- TH24 – 64 cores; 4 x DDR5 and/or 32GB HBM3 cache or dedicated memory; high-density, water-cooled
The T864 is said to be a one- and two-socket Intel Xeon E5 or E7 replacement, while the T432 will be a Xeon D, E3, and E5 replacement. The T216 is the entry-level version and shares the T432 die. The TH24 is said to offer the highest floating-point and AI performance. The chips can run at 4GHz.
Each of these cores is said to be faster than a Xeon and yet smaller than an Arm core – a fairly unbelievable claim – with Tachyum citing SpecInt and SpecFP 2006 benchmarks as some kind of proof.
Tachyum will port Linux and FreeBSD to the chip next year, we're told. Recompiled applications are said to run faster than they do on Xeon CPUs.
x86 and Arm binaries can run via QEMU emulation. Tachyum said emulation cuts performance by around 40 per cent relative to native Prodigy code, but claimed that a 4GHz emulated x86 core still runs faster than a real 2.5GHz Intel Xeon core.
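That last claim is worth a quick sanity check. The sketch below is our own back-of-the-envelope arithmetic, not anything Tachyum presented. It assumes throughput scales linearly with clock speed and that the 40 per cent penalty applies as a flat fraction of native performance, neither of which is guaranteed in practice. The function names are hypothetical.

```python
# Back-of-the-envelope check of the emulation claim.
# Assumptions (ours, not Tachyum's): performance scales linearly with clock
# speed, and the emulation penalty is a flat fraction of native throughput.


def native_equivalent_ghz(clock_ghz: float, emulation_penalty: float) -> float:
    """Clock of a native core doing the same work as the emulated core."""
    return clock_ghz * (1.0 - emulation_penalty)


def break_even_per_clock_ratio(
    prodigy_ghz: float, emulation_penalty: float, xeon_ghz: float
) -> float:
    """Per-clock advantage over the Xeon that a native Prodigy core needs
    for the emulated core merely to match the Xeon core."""
    return xeon_ghz / native_equivalent_ghz(prodigy_ghz, emulation_penalty)


if __name__ == "__main__":
    # Figures quoted in the presentation: 4GHz Prodigy, ~40 per cent
    # emulation penalty, 2.5GHz Xeon.
    equivalent = native_equivalent_ghz(4.0, 0.40)
    ratio = break_even_per_clock_ratio(4.0, 0.40, 2.5)
    print(f"Native-equivalent clock: {equivalent:.2f}GHz")
    print(f"Break-even per-clock ratio vs Xeon: {ratio:.3f}")
```

Under those assumptions, a 4GHz emulated core behaves like a 2.4GHz native one, so a native Prodigy core would need to do roughly 4 per cent more work per clock than the Xeon just to draw level. The claim therefore rests on Tachyum's per-core performance figures, which have yet to be independently verified.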
The design is nearing tape-out, which should take place in 2019, when its blueprints will be sent to TSMC for manufacture.
Tachyum argued that modern data centres are wall-to-wall x86 domains and often have less than 50 per cent CPU utilisation. The x86 CPUs can't run AI jobs particularly efficiently, it is claimed, and data centre owners are snapping up GPU and TPU-enhanced servers to run machine-learning workloads. If they used Prodigy CPUs instead, Tachyum claimed they could run both kinds of job on one architecture and save server space as well as energy costs.
Again, this silicon doesn't exist at the moment, so we'll have to wait until next year to see if it lives up to the hype. ®