Intel pulls up SoCs, reveals 'integrated' memory on CPUs
In future will stack memory atop the cores
Intel said it was working on stacking a layer of memory on its Xeon processors to run memory-bound workloads faster.
It said this in a pitch at the Denver-based Supercomputing Conference (SC13), which runs from 17 to 22 November.
According to an EE Times report, Intel's Rajeeb Hazra, a VP and general manager of its data centre group, said Intel would customise high-end Xeon processors and Xeon Phi co-processors by integrating memory more closely. It would first add memory dies to the processor package and, at a later date, integrate layers of memory dies into the processor itself, along with optical fabrics and switches.
Hazra mentioned the general memory stack idea, with an accompanying slide, in a 22 July presentation (PDF).
He also told press round table attendees at the conference that Knights Landing, the next-generation Xeon Phi co-processor with tens of cores, would have integrated memory. The concept of stacking memory dies in Xeon processor packages has come out into the open as well.
Intel classes memory dies that sit with the processor in a 3D package as Near Memory, in contrast to conventional DDR DRAM, which it calls Far Memory. Near Memory provides faster data access.
Hazra said: "We are looking at various new classes of integrations, from integrating portions of the interconnect as well as next-generation storage and memory much more intimately onto the processor die."
The memory address space in the dies could be treated as cache or as a flat memory space or as a combination of the two. Applications would need to be altered to use such a flat memory space adjacent to the CPU and separate from the normal DRAM memory.
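To make the flat-memory case concrete, here is a minimal Python sketch of what "altering an application" might involve: the program says which buffers it would like in Near Memory and falls back to Far Memory when the in-package pool is full. Nothing here is an Intel API. The class names, the `prefer_near` flag, the buffer labels and the 8GiB and 64GiB capacities are all hypothetical, chosen only for illustration, and the pools track capacity rather than real addresses.

```python
from dataclasses import dataclass, field

GIB = 1024 ** 3


@dataclass
class MemoryPool:
    """Toy pool that only tracks how many bytes are reserved."""
    name: str
    capacity_bytes: int
    used_bytes: int = 0

    def try_reserve(self, size_bytes: int) -> bool:
        """Reserve space if it fits; report whether it did."""
        if self.used_bytes + size_bytes > self.capacity_bytes:
            return False
        self.used_bytes += size_bytes
        return True


@dataclass
class FlatModeAllocator:
    """Hypothetical allocator for a flat near/far address space."""
    near: MemoryPool
    far: MemoryPool
    placements: dict = field(default_factory=dict)

    def allocate(self, label: str, size_bytes: int, prefer_near: bool = False) -> str:
        """Place a buffer and return the name of the pool that holds it."""
        if prefer_near and self.near.try_reserve(size_bytes):
            pool = self.near
        elif self.far.try_reserve(size_bytes):
            # Either near memory was not requested or it is full.
            pool = self.far
        else:
            raise MemoryError(f"no room for {label} in either pool")
        self.placements[label] = pool.name
        return pool.name


# Assumed capacities: a small in-package pool and a larger DRAM pool.
alloc = FlatModeAllocator(
    near=MemoryPool("near", 8 * GIB),
    far=MemoryPool("far", 64 * GIB),
)

first = alloc.allocate("stencil_grid", 6 * GIB, prefer_near=True)
second = alloc.allocate("lookup_table", 4 * GIB, prefer_near=True)
third = alloc.allocate("checkpoint_buffer", 16 * GIB)

print(alloc.placements)
```

In this toy run the first buffer fits in Near Memory, the second asks for it but spills to Far Memory because only 2GiB is left, and the third never asks. A cache mode, by contrast, would need no such hints from the application, since the hardware would decide what lives in the in-package memory.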
The amount of in-package memory would be limited by real estate, meaning the physical space inside the package, so we shouldn't expect such Near Memory to replace or substitute for Far Memory.
The in-package memory stacking would be for specific, presumably large-scale, customers such as Google, Facebook or Amazon, and would therefore run counter to general X86 standards. There would also need to be data-moving or tiering software to transfer data from Far Memory into Near Memory and vice versa. ®