Run a JSON file through multiple parsers and you'll get different results every time
Loose spec mangles data, creates security risks
The ubiquitous message-passing JSON format is something of an untended garden with plenty of security and stability traps for the unwary.
That warning comes from software engineer Nicolas Seriot, who last week presented his work on JSON parsers to an audience at Geneva's Soft-Shake Conference.
The problems arise because although JSON's inventor, Douglas Crockford, wanted to create something that was concise and unchangeable, the world didn't agree. There are now six documents that describe it, with differences between all of them; and as a result, no two parsers are quite alike.
In an (extremely tightly packed) post, Seriot describes how he created a suite of test files and fired them at a bunch of parsers. He found that “edge cases and maliciously crafted payloads can cause bugs, crashes and denial of services, mainly because JSON libraries rely on specifications that have evolved over time and that left many details loosely specified or not specified at all.”
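To make that approach concrete, here is a minimal sketch of such a harness in Python. It is not Seriot's suite: the five test documents, the `classify` and `run_suite` helpers and the verdict strings are all assumptions made for illustration, and the only parser exercised is Python's built-in `json` module.

```python
import json

# Hypothetical mini-suite, not Seriot's test files.
# Each case is (label, document, expectation), where expectation is
# "accept", "reject", or "unspecified" (the specs leave the outcome open).
CASES = [
    ("empty array", "[]", "accept"),
    ("trailing comma", "[1,]", "reject"),
    ("NaN literal", "[NaN]", "reject"),
    ("duplicate keys", '{"a": 1, "a": 2}', "unspecified"),
    ("huge exponent", "[1e400]", "unspecified"),
]


def classify(parse, document, expectation):
    """Run one document through `parse` and compare with the expectation."""
    try:
        parse(document)
        accepted = True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        accepted = False

    if expectation == "unspecified":
        return "accepted" if accepted else "rejected"
    if accepted == (expectation == "accept"):
        return "ok"
    return "should have failed" if accepted else "should have succeeded"


def run_suite(parse):
    """Return a {label: verdict} mapping for every case in the mini-suite."""
    return {label: classify(parse, doc, exp) for label, doc, exp in CASES}


if __name__ == "__main__":
    for label, verdict in run_suite(json.loads).items():
        print(f"{label}: {verdict}")
```

With default settings, Python's `json.loads` accepts the `NaN` literal even though the JSON grammar does not allow it, and it silently keeps the last value when a key is repeated. A different parser can legitimately give a different answer on the "unspecified" cases, which is the kind of disagreement Seriot catalogued.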
Seriot was thorough – the parsers he fired his tests at included:
- Five parsers written in C – JSMN, Jansson, CCAN, cJSON, and JSON-parser;
- Three Objective-C parsers – JSONKit, TouchJSON, and JSON-Framework;
- Three Java parsers – Gson, Jackson, and Simple JSON;
- Apple's JSONSerialization parser;
- The Freddy parser written in Swift;
- A Bash script, JSON.sh.
His basic assessment: “out of over 30 parsers, no two parsers parsed the same set of documents the same way”.
Seriot has published the full results as a table, in which a “red” entry means his test crashed the parser (bad, because crashes can lead to exploits). A “brown” entry – “parsing should have succeeded but failed” – is also dangerous, “because an uncontrolled input may prevent the parser to parse a whole document”. ®