Amazon: Intel Meltdown patch will slow down your AWS EC2 server

Sysadmins notice performance dip amid security fix rollout. Not everyone hit hard. YMMV etc

By Shaun Nichols in San Francisco

Posted in Cloud, 4th January 2018 22:37 GMT

Amazon AWS customers have complained of noticeable slowdowns on their cloud server instances – following the deployment of a security patch to counter the Intel processor design flaw dubbed Meltdown.

Punters said that, since AWS shored up its infrastructure, and began rolling out its Meltdown-patched Linux in December, they have noticed an increase in CPU utilization by their EC2 virtual machines. The solution is to either optimize application code running on the VMs, or move to a more powerful and expensive virtual machine to take the extra load.

Amazon has said it will help those suffering slower-than-expected performance.

To be clear, your humble vultures here at El Reg highly recommend you apply the Meltdown patches on your Intel-powered systems: the processor bug allows user processes to read passwords, keys and other sensitive data out of the kernel's protected memory area.

The software fixes – which are available for Linux, Windows, and macOS on Intel CPUs – move the operating system kernel into its own separate virtual memory space, protecting it from Meltdown exploits. The downside is that this introduces extra overhead, potentially slowing down the system.

The performance hit varies wildly depending on the type of applications you're running. Casual desktop users and gamers applying Meltdown mitigations on their computers shouldn't notice any slowdowns. Light installations, such as simple web servers, will be mildly affected. Machines hammering disk storage, slamming the network, or otherwise making lots of system calls, may see performance fall by up to 30 per cent. Your mileage may vary.
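For a rough feel of why system-call-heavy workloads are the ones that suffer, here is a minimal Python sketch that times a call that stays in user space against one that enters the kernel. The helper names `time_calls` and `noop` are our own, `os.getppid` is simply a cheap system call picked for illustration, and the numbers it prints will differ from machine to machine. Treat it as a toy comparison to run before and after patching, not a proper benchmark.

```python
import os
import time


def time_calls(fn, iterations=200_000):
    """Return the average seconds per call of fn over the given iterations."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return (time.perf_counter() - start) / iterations


def noop():
    """User-space baseline: does no work and makes no system call."""
    return None


if __name__ == "__main__":
    user = time_calls(noop)
    kernel = time_calls(os.getppid)  # each call crosses into the kernel
    print(f"user-space call: {user * 1e9:.0f} ns")
    print(f"system call:     {kernel * 1e9:.0f} ns")
```

Running this on the same instance before and after the patched kernel is installed should give an indication of how much each trip into the kernel now costs, though interpreter overhead will blur the figure.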

AMD processors are not affected by this particular design cockup.

A discussion thread in the AWS support forums details dips in performance that occur after rebooting Linux virtual machines with the Meltdown workaround – dubbed Kernel Page Table Isolation, or KPTI, on Linux – installed.
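If you want to check whether a Linux box reports the workaround as active, newer kernels expose a status file under sysfs. The sketch below assumes your kernel provides `/sys/devices/system/cpu/vulnerabilities/meltdown`, which older or unpatched kernels may not. The function names `classify` and `read_status` are ours, and the code falls back to "unknown" when the file is missing.

```python
from pathlib import Path

# Assumption: the running kernel exposes this sysfs entry; older kernels do not.
MELTDOWN_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/meltdown")


def classify(status):
    """Map the kernel's status line to a short label."""
    text = status.strip()
    if text.startswith("Mitigation"):
        return "mitigated"
    if text.startswith("Not affected"):
        return "not affected"
    if text.startswith("Vulnerable"):
        return "vulnerable"
    return "unknown"


def read_status(path=MELTDOWN_STATUS):
    """Read and classify the status file, or return 'unknown' if unreadable."""
    try:
        return classify(path.read_text())
    except OSError:
        return "unknown"


if __name__ == "__main__":
    print(read_status())
```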

"Immediately following the reboot my server running on this instance started to suffer from CPU stress," one admin noted after enabling the patch.

"Looking at CPU stats there was a very clear change in daily CPU usage pattern, despite continuing normal traffic to my server. I performed extensive review of what might have changed on my server configuration but drew a complete blank - configuration of the server did not change."

Another added: "This just happened to us today on a c3.large. The cost to us to move the platform to new hardware and the lost confidence from our customers is huge."

Developer Tim Gostony was also able to record how defending against Chipzilla's design blunder impacted the performance of two of his Intel-powered EC2 Linux instances.

AWS confirmed that the potentially-performance-limiting update the users spoke of was its fix for the kernel memory bug that afflicts the Intel CPUs it uses for the EC2 service. This low-level hardware vulnerability was discovered by researchers who privately alerted Intel in June 2017. Operating system-level workarounds were quietly developed, and rolled out on AWS from December. On Tuesday this week, word emerged of Intel's insecure speculative execution engine, which is at the heart of the security flaw.

On Wednesday, a collective of researchers went public with details of Meltdown, plus a related set of processor security holes dubbed Spectre – which also hits Intel chips, plus some AMD and Arm cores.

A drop in CPU performance is particularly troublesome for cloud compute subscribers whose providers bill by the hour or second. When workloads take longer to run, customers end up paying more for the same amount of work.
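To put a number on that, here is a back-of-the-envelope Python sketch. It assumes a fixed amount of work, time-based billing, and that a performance drop means proportionally lower throughput. The hourly rate used in the example is hypothetical, not an AWS price, and the function names are ours.

```python
def runtime_after_slowdown(baseline_hours, performance_drop):
    """Hours needed for the same work after throughput falls by performance_drop.

    performance_drop is a fraction, e.g. 0.30 for a 30 per cent reduction.
    """
    if not 0 <= performance_drop < 1:
        raise ValueError("performance_drop must be in the range [0, 1)")
    return baseline_hours / (1.0 - performance_drop)


def extra_cost(hourly_rate, baseline_hours, performance_drop):
    """Additional spend caused by the longer runtime at a flat hourly rate."""
    slowed = runtime_after_slowdown(baseline_hours, performance_drop)
    return hourly_rate * (slowed - baseline_hours)


if __name__ == "__main__":
    # Hypothetical figures: a 10-hour job at 0.10 per hour, worst-case 30% drop.
    print(f"{runtime_after_slowdown(10, 0.30):.1f} hours")
    print(f"{extra_cost(0.10, 10, 0.30):.2f} extra")
```

Note that a 30 per cent fall in throughput stretches a 10-hour job to roughly 14.3 hours, which is about 43 per cent more billable time, not 30.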

AWS told El Reg it will be reaching out to customers who notice a slowdown to help get performance back up to pre-patch levels.

"We don’t expect meaningful performance impact for most customer workloads," the cloud giant said. "There may end up being cases that are workload or OS specific that experience more of a performance impact. In those isolated cases, we will work with customers to mitigate any impact."

Meanwhile, Microsoft Azure is deploying Meltdown defenses, and Google's Compute Engine is secured. Check with your cloud provider for the latest on its response to Intel's engineering gaffe. The slowdown problem is not limited to AWS: you may experience a performance hit on other clouds. If so, this is why. ®
