Intel doubles throughput, slashes power to stave off DATAPOCALYPSE
With 750 gigabytes/sec looming, strong measures must be taken
Research@Intel Intel researchers have developed a prototype interconnect that they say will both increase bandwidth and lower power requirements, whether used for simple CPU-to-CPU connections or scaled up to connect tens of thousands of CPUs in a data center.
"Based on past trends, every four years the bandwidth requirements for systems is increasing by an order of magnitude," Intel senior research scientist Mozhgan Mansuri told The Reg at the annual Research@Intel dog and pony show in San Francisco on Tuesday.
If this trend continues, she said, by 2020 bandwidth needs will be in the range of 750 gigabytes per second. And since today's most efficient interconnects require about 10 or 15 picojoules per bit, the juice required to power that bandwidth will be about 50 per cent of the whole system's power in 2020, "which is unacceptable."
Mansuri describing this slide showing the growing power needs of I/O: 'There's a lot of assumptions there, but the point is that if we don't do anything about the I/O power, we're getting into trouble.'
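To put rough numbers on that worry, the following is a minimal sketch that multiplies the quoted 750 gigabytes per second by the quoted 10 to 15 picojoules per bit. The function name is hypothetical, and decimal gigabytes (10^9 bytes) are an assumption; the result works out to roughly 60 to 90 watts for I/O alone.

```python
def io_power_watts(bandwidth_gigabytes_per_s: float, energy_pj_per_bit: float) -> float:
    """I/O power implied by a bandwidth and an energy cost per bit.

    Assumes decimal gigabytes (1 GB = 1e9 bytes) and 8 bits per byte.
    """
    bits_per_second = bandwidth_gigabytes_per_s * 1e9 * 8
    joules_per_bit = energy_pj_per_bit * 1e-12
    return bits_per_second * joules_per_bit


if __name__ == "__main__":
    # Figures quoted in the article: 750 GB/s at 10 or 15 pJ/bit.
    for pj in (10, 15):
        print(f"{pj} pJ/bit -> {io_power_watts(750, pj):.0f} W")
```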
The solution that Mansuri and her colleagues have come up with is called, rather straightforwardly, "Scalable Energy Efficient I/O". Their setup consists of 64 bidirectional lanes controlled by a global clock – and it's that clock that is key to their system's efficiency, Mansuri said.
The global clock takes advantage of the fact that although systems require more and more peak bandwidth, most of the time they're not running at that rate, but instead they're perhaps idling or running workloads that require only half that peak.
If you want to conserve power in current implementations, she said, you simply turn off lanes that aren't needed, but keep the clock at the same rate – if you only need half the bandwidth, you turn off half the lanes. "So in that case, you just get the linear power saving – your bandwidth goes down by half, your power goes down by half."
In Scalable Energy Efficient I/O, if you want to halve the bandwidth of the interconnect, instead of turning off half the lanes, you keep all the lanes running but drop the throughput of each of the lanes. By doing so, the power requirements drop in a curve that's notably steeper than mere linearity. "So potentially I get much better power saving than the linear bandwidth of scaling gives me," Mansuri said.
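The difference between the two approaches can be illustrated with a toy model. The article gives no formula for the steeper curve, so the exponent `alpha` below is purely hypothetical, as are the function names; the only point is that any exponent above 1 beats the linear savings of switching lanes off.

```python
def power_lane_shutoff(bandwidth_fraction: float) -> float:
    """Relative power when unneeded lanes are switched off and the clock
    rate is unchanged: power falls linearly with bandwidth."""
    return bandwidth_fraction


def power_rate_scaling(bandwidth_fraction: float, alpha: float = 2.0) -> float:
    """Relative power when every lane stays on at a reduced rate.

    alpha is a hypothetical exponent (> 1) standing in for the
    'steeper than linear' curve; it is not a figure from Intel.
    """
    return bandwidth_fraction ** alpha


if __name__ == "__main__":
    print("bandwidth  lane-shutoff  rate-scaling")
    for fraction in (1.0, 0.5, 0.25):
        print(f"{fraction:9.2f}  {power_lane_shutoff(fraction):12.2f}  "
              f"{power_rate_scaling(fraction):12.2f}")
```

With the assumed `alpha` of 2, halving the bandwidth cuts relative power to a quarter rather than a half; the real curve would depend on the prototype's circuits.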
The power savings and the throughput can both be considerable, she said, showing slides with examples such as an 8-link, four-socket blade in which her team's 32-lane design consumed 22 watts while providing 8 terabits per second of throughput, compared with a 32-lane PCIe 3.0 setup that consumed 62 watts and provided 4 terabits per second.
Twice the throughput at less than half the power; what's not to like except the price tag?
Another slide imagined a data center with 100,000 links, in which a prototype 32-lane Intel system consumed 270 kilowatts to provide an aggregate of 100 petabits per second, while a 32-lane PCIe 3.0 setup consumed 770 kilowatts to pump along 50 petabits per second.
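Dividing power by throughput turns both slides into an energy-per-bit figure. The sketch below uses only the numbers quoted above; the function and dictionary names are hypothetical, and decimal prefixes (tera = 10^12, peta = 10^15) are assumed. Both examples come out near 2.7 picojoules per bit for the prototype against roughly 15.5 for PCIe 3.0.

```python
def picojoules_per_bit(power_watts: float, throughput_bits_per_s: float) -> float:
    """Energy spent per transferred bit, in picojoules."""
    return power_watts / throughput_bits_per_s * 1e12


# (power in watts, throughput in bits per second), as quoted from the slides.
SLIDE_FIGURES = {
    "blade, Intel prototype": (22.0, 8e12),
    "blade, PCIe 3.0": (62.0, 4e12),
    "data center, Intel prototype": (270e3, 100e15),
    "data center, PCIe 3.0": (770e3, 50e15),
}

if __name__ == "__main__":
    for name, (watts, bits_per_s) in SLIDE_FIGURES.items():
        print(f"{name}: {picojoules_per_bit(watts, bits_per_s):.2f} pJ/bit")
```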
Mansuri emphasized that the prototype which she showed us humming along nicely was just that – a prototype – and that much work needs to be done both in its design and its manufacturability at a reasonable cost. "So this will be more attractive to the high end," she said, "and that's why we're looking to the data center." ®