Intel's latest promise: Our first AI ASIC chips will arrive in 2019

For now you'll just have to make do with its Xeons

Naveen Rao lays out Chipzilla's plans for the future

AI Dev Con Intel announced a range of machine learning software tools and hinted at new chips on Wednesday, including its first commercial AI ASIC, the NNP-L1000, launching in 2019.

Naveen Rao, head of AI at Intel, kickstarted Chipzilla’s first AI developer conference, AIDevCon, in San Francisco. Rao was CEO and co-founder of Nervana, a deep learning startup that was acquired by Intel in 2016.


AI hype cycles have led to multiple booms and busts, Rao explained. At the moment, its popularity is rising rapidly and it’s greedily slurping up all the data and compute available. The AI revolution is really a computing revolution. And everyone can join in on the AI fun: all you need is a bunch of CPUs, apparently.

“In fact, if you have Xeons today you don’t really need anything else to get started,” he said.

Intel’s whole spiel is that the tools needed for AI aren’t “a one size fits all problem”. Instead, solutions will come from a mixture of deep learning and more classical computational methods like random forests or regression analysis. And that can all be done with a smooth marriage of software and hardware.

So, here’s a quick recap of some of what was discussed today, starting with software:

  • MKL-DNN: It stands for math kernel library for deep neural networks. It’s a collection of mathematical routines for common components in neural networks, including matrix multiplication, batch norm, normalization and convolution. The library is optimised for deploying models across Intel’s CPUs.
  • nGraph: Developers choose different AI frameworks, and they all have their own advantages and disadvantages. In order for chips to be flexible, the back-end compiler must be able to accommodate all of them effectively.

    nGraph is a compiler that does this across Intel’s chips. Developers might want to train their model on Intel’s Xeons, but then use Intel’s neural network processor (NNP) for inference afterwards.

  • BigDL: This is another library for Apache Spark, aimed at handling larger workloads in deep learning using distributed learning. Applications can be written in Scala or Python and executed on Spark clusters.
  • OpenVINO: A software toolkit to deploy models dealing with video on “the edge”, aka IoT devices like cameras or mobile phones. Developers will be able to do things like image classification or facial recognition in real time. It is expected to be open sourced later this year, but is available for download now.
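
For readers wondering what those "common components" actually compute, here is a minimal plain-Python sketch of three of the operations named in the MKL-DNN entry: matrix multiplication, batch norm and convolution. This is not the MKL-DNN API. The function names are hypothetical and chosen only for illustration, and an optimised library would run the same arithmetic with vectorised, cache-aware kernels rather than Python loops.

```python
# Illustrative only: plain-Python versions of operations the article lists.
# These are NOT MKL-DNN calls; all function names here are hypothetical.
import math


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def batch_norm(values, eps=1e-5):
    """Rescale a batch of scalars to zero mean and roughly unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation, the 'convolution' used in neural nets."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]
```

Nearly all of the time spent training or running a neural network goes into loops like these, which is why a tuned maths library matters more than the framework sitting on top of it.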

Now it gets hard

Now for the hardware part. Intel were quieter on this front and didn’t divulge many details beyond the usual marketing babble.

“Xeons weren’t right for AI a couple of years ago, but that has really changed now,” Rao said. Increased memory and compute mean performance is now 100x what it was on its Haswell chip, with nearly a 200x rise for inference.

“You might have heard that GPUs are 100 times faster than CPUs. That is false,” he added. “Most inference is run on Xeons today.”

Without mentioning Nvidia at all, Rao explained that GPUs have had a great start in deep learning but are limited by severe memory constraints. Xeon has more memory and can scale to large batch sizes, so it’s better for inference, he said.

He briefly talked about FPGAs for acceleration and said Intel are working on a “discrete accelerator for inference”, but couldn’t share any details.

In the meantime, there is still the Intel Movidius Neural Compute Stick. It’s a USB stick that runs models written in TensorFlow and Caffe and consumes about a single watt of power. It was announced last year, when Intel decided to kill its wearable gizmos like smart watches and fitness bands.

Neural Network Processor: The ASIC chip was also announced last year. No benchmarks were released, and Intel just said it would be available to select customers. Not much was revealed today either. What we do know is that it contains 12 cores based on its “Lake Crest” architecture. It has a total of 32GB of memory, a performance of 40 TFLOPS at an undisclosed precision, a theoretical latency of less than 800 nanoseconds, and 2.4 terabits per second of bandwidth over its low-latency interconnects.
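
To put the interconnect figure in more familiar units, here is a small sketch that converts the quoted 2.4 terabits per second into gigabytes per second, which works out to about 300GB/s. It assumes decimal (SI) prefixes and eight bits per byte. The constant and function names are made up for illustration, and the result is simple unit arithmetic rather than a measured throughput.

```python
# Unit conversion for the interconnect figure quoted in the article.
# Assumes SI prefixes (1 terabit = 1,000 gigabits) and 8 bits per byte.
INTERCONNECT_TBITS_PER_S = 2.4  # figure quoted for the Lake Crest NNP


def tbits_to_gbytes_per_s(tbits_per_s):
    """Convert terabits per second to gigabytes per second."""
    gbits_per_s = tbits_per_s * 1000
    return gbits_per_s / 8


interconnect_gbytes_per_s = tbits_to_gbytes_per_s(INTERCONNECT_TBITS_PER_S)
```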

Finally, the NNP-L1000. Even less was said here. This will be the first commercial NNP model, and will be available in 2019. It’ll be based on the new Spring Crest architecture, and is expected to be three to four times faster than the previous Lake Crest model. ®

Biting the hand that feeds IT © 1998–2018