Google open source guru: 'Why we ban the AGPL'
Mountain View crawlers spot 31 million open source projects
Google open source guru Chris DiBona says that the web giant continues to ban the lightning-rod AGPL open source license within the company because doing so "saves engineering time" and because most AGPL projects are of no use to the company.
The Affero GPL is designed to close the so-called "application service provider loophole" in the GPL, which lets ASPs use GPL code without distributing their changes back to the open source community. Under the AGPL, if you use code in a web service, you are required to open source it.
Google, of course, is a service provider, and the core components of Google's famously distributed back-end infrastructure – which run all its online services – are not open source. The company once banned the AGPL from Google Code, its hosting site for open source projects, but last September, it reversed course and said it would allow AGPL projects back onto the site, along with any other license approved by the Open Source Initiative. Within Google, however, the license is still banned.
Speaking on Wednesday morning at NASA's inaugural Open Source Summit at the Ames Research Center in Silicon Valley, DiBona said the company's AGPL ban is "more of a procedural issue than anything else".
"If you look at the interior of Google and how we make software, we don't launch a lot of software to the outside world," he said. "With the AGPL, you have to be very, very careful with how it is expressed. Otherwise you have to invoke the sharing in many different places. [The ban] is really about saving engineering time."
He also indicated that there are no AGPL projects that the company really needs. "This might sound a little jerky, but a lot of the [available] AGPL software, we don't need to use," he said. "It addressed areas where we already have software. So there isn't a lot of call for it."
MongoDB is probably the most prominent AGPL project, he said, but it replicates software already used within the Google back-end infrastructure. Google uses a proprietary custom-built distributed database known as BigTable. The company has repeatedly indicated that it will not open source BigTable, but it has published a research paper on the platform, and its basic ideas are now used in open source platforms such as Hadoop HBase and Cassandra.
In similar fashion, the company has not open sourced core platforms such as the Google File System (GFS), its distributed file system, and MapReduce, its distributed number-crunching platform.
At the beginning of his talk, DiBona said that according to Google's net crawlers, the web now contains over 31 million open source projects, spanning 2 billion lines of code. Forty-eight per cent of these projects are under the GPL, 23 per cent use the LGPL, 14 per cent use the BSD license, 6 per cent use Apache, and 5 per cent use the MIT license. All other licenses are used by fewer than 5 per cent of the projects. Google's preferred license, DiBona reiterated, is Apache, because "it has patent grants that are fair." Unlike the GPL, Apache has no copyleft requirement, meaning those who use Apache code needn't distribute their changes back to the community.
Asked what he thought of NASA's NOSA open source license – the NASA Open Source Agreement license, which also lacks a copyleft provision – DiBona said that the only problem he has with it is that it "creates a pool of incompatible software." DiBona has long said that Google does not believe in license proliferation. "The ten most popular open source licenses should be plenty," he said on Wednesday, "and having additional licenses is a waste of time." DiBona used the same argument in previously banning the AGPL from Google Code.
We attempted to ask DiBona about Google's recent decision not to open source the new Honeycomb tablet incarnation of Android "for the foreseeable future", but he declined to answer questions as he was needed in an apparent powwow with NASA lawyers.