This article is more than 1 year old

Google Native Client: The web of the future - or the past?

This time, it's Mozilla v Google

A sandbox twenty years in the making

Native Client is inspired by a research paper published by a team of academics at the University of California, Berkeley in the early 90s. The paper described a means of sandboxing native code known as "software-based fault isolation". In essence, this system used a special instruction sequence to ensure that machine code programs conformed to certain security properties, and then it could analyze the resulting code to determine whether those properties were indeed followed.

Brad Chen

Brad Chen

The trouble is that the technique wasn't feasible for use with the x86 instruction set used by modern Intel and Intel-compatible processors. But Google changed that, in part by tapping the instruction set's rarely-used "segment registers". Expanding on a second paper from MIT and Harvard academics Stephen McCamant and Greg Morrisett, Google produced a version of the system that not only worked with x86, but did so with low overhead, letting native code run fairly close to its original speed. "They took the idea and really made it real," Harvard professor Morrisett tells The Register. "They have a very good handle now on how to do this stuff."

With the 32-bit x86 instruction set, Native Client uses the segment registers to restrict where in memory a program can read and write data and to ensure that a program doesn't jump to code outside a certain range of memory. But it also includes a modified compiler and a code verifier that work to keep code jumps in line.

An ordinary program will read a data value from memory into a register and then jump to the address that value represents. But with Native Client, the compiler performs a bit of arithmetic on that value before the jump to ensure it doesn't target bad instructions, and then the code verifier double-checks the compiler's work.

Among other things, the verifier must ensure that a program doesn't leap past the extra instructions inserted by the compiler. "You've added this arithmetic to make sure the jumps are in range. But the real issue is that if it's really clever, a program will arrange for a jump to jump past that arithmetic," says Morrisett. "You might protect one jump but not the next one."

Greg Morrisett

Greg Morrisett

Originally, it was difficult to write such a verifier for x86 because the lengths of the instructions vary. If the lengths aren't the same, it takes far too long to find rogue instructions. As you parse the sequence of raw bytes looking for these bad boys, you have to assume that they could start at any given byte.

But McCamant, then a grad student at MIT, realized this problem could be eased by "padding" the instructions with dummy bytes. "On a RISC machine [where all instructions are the same length], there's only one parse. But on an x86 machine, there can be multiple parses, and it was infeasible to check all of them," says Morrisett. "The trick was to force the instructions to be aligned in a certain way." And Google ran with the idea. In addition to boosting the performance of the system by way of the x86 segment registers, Google improved on the speed of McCamant's padding method.

The games Google plays

Like any other piece of software, Native Client will contain its own security holes, and according to Morrisett, this is a particular worry with the code verifier. In 2009, Google ran a contest to identify bugs in Native Client, and yes, some were found. But Morrisett now describes the system as "pretty robust".

The larger point is that in providing software-based fault isolation on x86, Native Client still maintains the speed of native code – or thereabouts. According to Morrisett, the sandbox adds only about 5 per cent overhead, which gives Native Client a significant advantage over JavaScript.

"It gives you a tremendous improvement in performance compared to other options for running code in the browser," he says. "It makes it possible to do serious rendering and serious number-crunching in the browser without a big performance hit."

Specifically, Native Client can provide more in the way of parallel execution than JavaScript, and it allows for much faster vector arithmetic. For Google, it's a way of opening the browser to 3D games, video editing, and other applications that require that extra bit of oomph, and such applications can be moved to the platform with relatively little effort. Existing native applications must be ported for use on Native Client, but developers needn't start from scratch. They can use their existing codebase.

"[Native Client] gives you a tremendous improvement in performance compared to other options for running code in the browser. It makes it possible to do serious rendering and serious number-crunching in the browser."

– Greg Morrisett

This proposition fits quite nicely with Chrome OS, the fledgling Google operating system that puts all applications inside the browser. With Chrome OS, running existing 3D games and other desktop applications isn't really an option. But the Native Client project pre-dates Google's operating system effort, and the ultimate goal is to bring a new breed of applications to the entire web.

"JavaScript now does integers well, but JavaScript engines haven't really gotten around to vector arithmetic yet, and that's where native code currently tends to have a big advantage...These things aren't important to all software, but when they are, it can mean one or two orders of magnitude in performance," says Google's Linus Upson.

"The sweet spot is 3D games. They do a lot of computation, and the codebase already exists in C++, so they don't have to be rewritten. It allows us to get a lot more fun and exciting games on the web."

The rub is that Native Client isn't the web – at least not yet. It will soon be an integral part of Google's browser and its browser-based operating system. But that's a little different.

More about

TIP US OFF

Send us news


Other stories you might like