Raw power? More like core power: How and why you need to accelerate AI, at the edge and in the backend
From 5G to retail, nothing can escape the rise of machine learning
Sponsored
Enterprise adoption of artificial intelligence (AI) increased 270 per cent in the past four years, according to research firm Gartner’s 2019 CIO Survey. Many organisations are only now starting to understand and implement the technology in a variety of applications to supplement parallel process automation and digital transformation initiatives.
Deloitte’s State of AI in the Enterprise report suggests that by 2020 the penetration of enterprise software with AI built in, and cloud-based AI development services, will reach 87 per cent and 83 per cent, respectively, while a McKinsey survey conducted in November 2018 estimates 71 per cent will increase their spending on AI in the coming years.
Much of that investment will be spent on robotic process automation (RPA), computer vision, and machine learning (ML) applications, with AI and ML also playing an increasing role in data analytics and cyber security.
Next-gen CPUs optimised for AI/ML
Resource-intensive AI workloads need a lot of processing power to handle the inference and training processes that applications use to learn from, and adapt to, the vast amount of data they collect and analyse. The typical approach to providing that CPU muscle is to utilise on-demand infrastructure-as-a-service (IaaS) capacity provided by large cloud service providers and high-performance compute (HPC) server architectures, supplementing them with graphics processing units (GPUs) and other accelerators as and when needed.
The last few years have seen a new generation of hardware start to emerge, destined not only for more efficient servers hosted in super-scale cloud data centres but also smaller purpose-built devices which can be located elsewhere, especially at the edge of the network. Established silicon makers have moved to deliver CPUs customised for the unique requirements of AI and ML workloads while venture capital has poured into startups designing and building their own.
The new family of 10-nanometre Intel® Agilex™ field programmable gate arrays (FPGAs), optimised to accelerate AI/ML application performance and lower power consumption, is expected to start sampling later this year. Flex Logix, too, will introduce an FPGA chip – InferX – that allows companies to port their existing AI models into the silicon within roughly the same timeframe.
The latest generation of Intel® Xeon® Scalable processors has also been enhanced with Deep Learning (DL) Boost, a feature designed to accelerate AI workloads such as image recognition, object detection, and image segmentation. The processors were developed in collaboration with DL/AI software framework developers, including Google-owned TensorFlow as well as PyTorch, Caffe, MXNet, and PaddlePaddle.
Early adopters testing AI workload acceleration
Yet the key to encouraging greater use of AI/ML in a broader set of enterprise data analytics use cases probably lies in making the technology much easier for smaller companies to adopt and mould to their own applications, services, vertical sectors, and business models. Sri Satish Ambati is chief executive and cofounder of open-source machine-learning outfit H2O.ai, formerly known as 0xdata. He points out that AI acceleration is a more cost-effective way of analysing large data sets in the financial services industry for fraud detection and credit scoring.
“There is a crunch between time, talent, and trust in AI as early adopters operationalise machine-learning workloads as part of their data-science projects,” he said. “That is where we can see a really strong hardware and software play – most companies have GPUs but want to prototype platforms on CPUs.”
Deep learning and AI are also having an impact in the physical security market, particularly video surveillance systems that collect vast amounts of real-time footage and apply object classification to learn the difference between people, animals, birds, vegetation, vehicles, and other objects with a high degree of accuracy.
Agent VI has built its own distributed video analytics platform that processes data both at the device itself (usually the IP camera or the encoder) and on a server located either in the cloud or on-premises. It used Intel’s OpenVINO toolkit to utilise the silicon maker’s GPUs to improve AI inference in the cloud, and is now pushing a lot of that processing to the edge.
Taking that approach allows it to reduce cloud storage, GPU, and bandwidth costs – benefits the company has been able to pass on to its own customers in the form of lower subscription fees, thereby driving greater adoption.
AI and ML in security
Cybersecurity is another area of the IT industry where intelligent use of hardware-optimised AI and ML can deliver time and cost benefits. Using CPU-embedded hardware acceleration alongside machine learning can help trawl through massive volumes of network traffic to identify suspicious patterns in real time, which may indicate a cyber attack is underway or imminent, for example.
Based on known threat models and complex analysis of historical data, the AI engine can recognise what constitutes normal system behaviour and what abnormal behaviour looks like, too. That provides an early-warning system, which gives security analysts time to take preventative measures that may halt the attack, or limit the damage it can cause before it is neutralised.
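To make the idea of learning "normal" behaviour concrete, here is a deliberately simple sketch in Python. It is not how any vendor mentioned here does it: it assumes traffic can be reduced to a single number per interval (a hypothetical requests-per-second figure), and it swaps a trained model for a plain statistical baseline, flagging anything more than three standard deviations from the historical mean. The function names and sample values are illustrative only.

```python
from statistics import mean, stdev


def fit_baseline(history):
    """Summarise historical measurements as (mean, sample standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Return True if value lies more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any change at all counts as abnormal.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical requests-per-second samples from a quiet period (assumed data).
history = [100, 102, 98, 101, 99, 100, 103, 97]
baseline = fit_baseline(history)

# New observations arriving in real time; only the spike should be flagged.
incoming = [101, 99, 250, 100]
alerts = [v for v in incoming if is_anomalous(v, baseline)]
```

A production system would track many signals at once and learn far richer patterns, but the shape is the same: build a picture of normal from history, then score new traffic against it as it arrives.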
“People in financial services have to do fraud detection, and for that they need an analytics tool that deals with hundreds of petabytes of data,” said Brian Bulkowski, founder and advisor at database company Aerospike. “They have data scientists crafting new queries to do that, but they now also use meta-optimisation as part of AI.”
AI at the network edge
With so much data to process, bandwidth constraints sometimes mean it is impossible (or at least cost-prohibitive) to send it all back to the data centre for storage and analysis. That means some organisations – in telecommunications, financial services, transportation and retail, for example – have to perform complex AI workloads closer to the source of the information, including at the network edge, if they can get the right CPU muscle there to handle it.
“At the moment, AI solutions are fairly unwieldy platforms that sit in a back room somewhere and need data scientists to access them,” said Alex Lam, vice president and head of the North American Strategy Office for Fujitsu. “But once fifth generation (5G) [mobile] networks arrive, we will see a decentralisation of AI that moves data closer to the IoT and the edge. Then we will see the promise of what it [AI] can deliver as it becomes fully integrated into our homes, vehicles, and cities.”
Prasanna Sundararajan is founder and CEO at startup Reniac, a company that designs software to accelerate data and network traffic in public and private cloud-hosting architectures. He, too, is beginning to see the emergence of edge processing in tandem with the spread of both 5G networks and Internet of Things (IoT) connectivity.
“Why not start filtering [data from] nodes and devices so you would be able to process, ingest, and store it at the edge? Couple them close to data, and make a decision about what gets moved into the cloud – that is what we are hearing from geographically distributed customers,” he said.
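A rough sketch of that filter-then-forward pattern, in Python, might look like the following. It is an illustration under stated assumptions rather than anything Reniac has described: the reading format, the `filter_at_edge` function, and the "normal range" rule are all hypothetical. Readings inside the expected range are condensed into a local summary, and only the out-of-range ones are queued for the cloud.

```python
def filter_at_edge(readings, low, high):
    """Split readings into (items to forward to the cloud, local summary).

    Readings within [low, high] are treated as routine and only summarised;
    anything outside that range is forwarded in full. The rule is assumed.
    """
    forward = []
    routine = []
    for reading in readings:
        if low <= reading["value"] <= high:
            routine.append(reading["value"])
        else:
            forward.append(reading)
    summary = {
        "count": len(routine),
        "mean": sum(routine) / len(routine) if routine else None,
    }
    return forward, summary


# Hypothetical temperature readings from one edge node (assumed data).
readings = [
    {"sensor": "a", "value": 21.0},
    {"sensor": "b", "value": 21.5},
    {"sensor": "c", "value": 35.0},
    {"sensor": "d", "value": 20.5},
]
to_cloud, local_summary = filter_at_edge(readings, low=18.0, high=25.0)
```

Here one reading out of four crosses the network, while the other three are reduced to a count and an average held at the edge. In practice, the decision rule could just as easily be an ML model running on the edge device.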
New CPUs designed for edge deployment
The problem for edge hardware manufacturers is that enterprise customers need a lot of AI/ML processing power packed into devices destined for potentially insecure environments where the availability of space and electricity is often limited.
Vik Malyala is the senior vice president in charge of field applications engineering and business development for systems board manufacturer Supermicro. His company is currently building components for edge equipment used in multiple verticals, including telecommunications and retail. “We are talking about power consumption and resilience – it [the device] needs to run by itself for a long period of time, so all good portions of hardware engineering need to come to the [network] edge,” he explained. “Retail uses lots of small appliances, and some collect data and run analytics ... Suddenly there is a need to do more compute, more quickly.”
Silicon manufacturers have stepped up to the plate here, designing chips to optimise power usage, heat production, and space requirements in restricted environments.
An Intel® Xeon® CPU family aimed specifically at edge computing deployments was released last year. The six chips in its D-2100 system-on-a-chip range feature four to 18 cores and 2.3GHz to 2.8GHz clock speeds, and are well suited to integration in servers embedded in mini data centres, smart cars, mobile phone masts and other infrastructure devices. Google, meanwhile, produces an Edge tensor processing unit (TPU), a low-power, IoT-orientated version of its Cloud TPU.
Choice of architectures to suit different AI needs
Ultimately, there is no cookie-cutter approach to processing AI/ML workloads. One size definitely does not fit all – individual enterprise applications will need different hardware capabilities depending on the specific data sets they process, associated performance requirements, business metrics, and the regulatory environments they operate in. The good news is that silicon manufacturers have expanded their portfolios to better address the array of usage scenarios now on the table, presenting hardware makers with a greater choice of architectures to match to their customers’ needs.
Sponsored by Intel®.