
Bad benchmarks bedevil boffins' infosec efforts

'Benchmark crimes' understate true performance impact of security controls

A group of operating systems specialists has said that sloppy benchmarking is harming security efforts by making it hard to assess the likely performance impact of security countermeasures.

The researchers, from the Netherlands and Australia, decided to take a look at the accuracy of security researchers' systems benchmarks. As they explain in this paper at arXiv, security papers are littered with so-called “benchmarking crimes”.

The Register spoke to Gernot Heiser, a long-time researcher in trustworthy systems at Australian research centre Data61, a professor at the University of New South Wales, and co-founder of OK Labs, which developed one of the world's first “provably secure” microkernels.

Heiser became interested in benchmarking because “I got annoyed by common deficiencies” in how security researchers validate system performance. His Dutch colleagues shared his irritation.

As Heiser explained to The Register, bad benchmarks are more than an irritant because for any security solution “you need to show two things – that the mechanism is effective, that it prevents certain classes of attacks; and that it's usable, because it doesn't impose an undue overhead.”

That makes “benchmark crimes” (a colourful rather than literal term) important, because they can make a promising fix unusable in the real world.

In their analysis of 50 papers published between 2010 and 2015 (in Usenix Security, as well as IEEE's Security & Privacy, the ACM's CCS, and papers accepted by the NDSS symposium), the researchers say they identified 22 categories of “benchmarking crimes”, ranging from ignoring performance impacts altogether, through “creative overhead accounting” and the use of misleading benchmarks, all the way to presenting only relative numbers in a benchmark.

Most often, Heiser said, the crime is that “evaluation data is not complete enough … you look at the 'cost' of the mechanism in a scenario, without doing a thorough evaluation of the performance effects in a representative set of scenarios”.

Take, for example, a researcher running the SPEC suite on systems with and without their security solution. “The suite is designed to represent a broad class of use-cases,” he said, but “SPEC only makes sense if you run all the individual programs to come up with the score”.

Cherry-picking SPEC results makes them less representative: “you might pick predominantly CPU-intensive processes and ignore memory-intensive processes,” he said.
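
To see how this skews the picture, here is a minimal sketch with invented numbers – the per-benchmark slowdowns below are illustrative only, not drawn from the paper or from real SPEC runs – showing how a cherry-picked subset of a SPEC-style suite can understate the composite overhead:

```python
from statistics import geometric_mean  # Python 3.8+

# Hypothetical per-benchmark slowdowns (runtime with the security mechanism
# divided by baseline runtime). Purely illustrative figures.
slowdowns = {
    "cpu_bound_a":    1.02,
    "cpu_bound_b":    1.03,
    "cpu_bound_c":    1.01,
    "memory_bound_a": 1.35,
    "memory_bound_b": 1.48,
    "memory_bound_c": 1.60,
}

# Honest reporting: a geometric mean over every benchmark in the suite,
# which is how SPEC-style composite scores are meant to be formed.
full_suite = geometric_mean(slowdowns.values())

# "Benchmark crime": quietly dropping the memory-intensive programs and
# reporting only the CPU-bound subset.
cherry_picked = geometric_mean(
    v for k, v in slowdowns.items() if k.startswith("cpu_bound")
)

print(f"Full suite overhead:    {(full_suite - 1) * 100:.1f}%")    # roughly 23%
print(f"Cherry-picked overhead: {(cherry_picked - 1) * 100:.1f}%") # roughly 2%
```

With these made-up figures, the honest composite shows an overhead of around 23 per cent, while the cherry-picked subset suggests a near-negligible 2 per cent – exactly the kind of over-optimistic picture the researchers warn about.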

Heiser said the prevalence of benchmarking crimes is partly a symptom of the complexity of modern systems: authors might be sloppy or careless, but equally, they might have trouble understanding the implications of their own work.

“That takes a fair degree of expertise,” he said, such that even the people peer-reviewing papers don't notice the problem.

“The upshot is that you get too optimistic a picture of what you can do against a particular attack.” ®
