Googlegate: Mapping a scandal of global proportions
Google is not above the law
Opinion
While the rest of us have generally been enjoying the sunshine and warm weather for the past few weeks, there has been a permanent cloud over Mountain View, as the storm over Google's capturing of Wi-Fi content with its Street View cars has developed.
That storm now threatens significant reputational damage to Google, not least because dozens of countries are considering initiating criminal prosecutions against it and indeed a number of police investigations have already begun.
On April 22nd 2010, news broke that Google's Street View cars had been surreptitiously collecting Media Access Control (MAC) addresses and Service Set Identifiers (SSIDs) from Wi-Fi networks as they roamed the planet taking photographs of our houses.
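For readers unfamiliar with how these identifiers travel, a short sketch may help: every Wi-Fi access point broadcasts unencrypted "beacon" frames that carry its MAC address and SSID in plain sight, which is what made drive-by collection trivial. The frame below is hand-built for illustration, not captured traffic, and the parser assumes a bare 802.11 management-frame layout with no radiotap header.

```python
def parse_beacon(frame: bytes):
    """Extract the BSSID (MAC address) and SSID from a raw 802.11 beacon frame.

    Layout assumed: 24-byte MAC header, 12 bytes of fixed beacon parameters,
    then tagged parameters, where tag 0 carries the SSID.
    """
    # MAC header: frame control(2) duration(2) addr1(6) addr2(6) addr3(6) seq(2)
    addr3 = frame[16:22]                      # BSSID in a beacon frame
    bssid = ":".join(f"{b:02x}" for b in addr3)
    # Skip the 24-byte header and 12-byte fixed parameters, walk the tagged fields
    i = 36
    ssid = None
    while i + 2 <= len(frame):
        tag, length = frame[i], frame[i + 1]
        if tag == 0:                          # tag 0 is the SSID element
            ssid = frame[i + 2:i + 2 + length].decode("utf-8", "replace")
            break
        i += 2 + length
    return bssid, ssid

# A hand-built example frame (hypothetical network "HomeNet")
frame = (
    bytes.fromhex("80000000")                 # frame control (beacon) + duration
    + b"\xff" * 6                             # addr1: broadcast
    + bytes.fromhex("00163e5d4a21") * 2       # addr2/addr3: the access point's MAC
    + b"\x00\x00"                             # sequence control
    + b"\x00" * 12                            # timestamp, interval, capabilities
    + b"\x00\x07HomeNet"                      # tag 0: SSID "HomeNet"
)
print(parse_beacon(frame))                    # ('00:16:3e:5d:4a:21', 'HomeNet')
```

The point is simply that no decryption or intrusion is needed: the identifiers are broadcast to anyone listening.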
Street View has been contentious enough from a privacy perspective, with many people concerned about the dangers such activity presents, and has been in the headlines frequently. But once it was discovered that Google was capturing Wi-Fi identifiers as well, the controversy snowballed.
Some people don't see the problem - they contend that the data Google was collecting is harmless and that the fuss is all about nothing. As privacy advocates, we do not have the liberty of taking such a narrow view.
We all need to understand that Google already has an overwhelming quantity of data on a significant percentage of the global population, so having the ability to now marry that existing data with geo-location data gives the search giant even more insight into who and where we are.
We accept that some people really don't care if Google holds all this data and information on us, but many of us do care: many find it offensive, and many feel they have no control over that data or how it is used.
One can talk about human rights or the countless other legislative measures designed to protect our privacy, but at a fundamental level it should be obvious that anyone wishing to extract commercial value from private and personal data should do so ethically and with consent. This is not just because the law says so, but because it is simple common courtesy; it illustrates a level of respect which in turn builds confidence that such data will not be abused or used inappropriately. One can hardly expect people to trust that their data is safe from abuse when the organisations collecting it do so in an underhanded and clandestine manner. That is no way to instil confidence, and it is likely to damage a brand's reputation.
That said, had the collection been limited to just MAC addresses and SSIDs, it is likely that by now the storm would have blown itself out and Eric Schmidt would probably be relaxing by one of his pools, chalking the incident up as another victory illustrating the strength of his brand.
However, within three weeks the scandal gained new traction when Google admitted via its blog that it had also been soaking up the actual contents of unencrypted Wi-Fi communications with its Street View data sponge.
This was a much more serious issue, and it was clear from the disclosure that Google knew it: the company immediately apologised and called the collection an accident. This is significant because intercepting and retaining those communications is, in many jurisdictions, a criminal act, so it was critical for Google to mitigate the situation by denying intent – an important factor in assessing a case for criminal prosecution. We were immediately unconvinced that this activity could have been carried out accidentally. Having been involved in large technology projects for the better part of fifteen years, I found it untenable that this "rogue code" could have found its way into the project and been deployed without anyone knowing it was there. Within ten minutes of Google disclosing this information on its blog, we released our response on our web site.
Then on my blog I explained the basic principles of project development and deployment in the IT sector, discussing the four core stages such projects generally go through. It was not a specialised view, and I accept that many projects differ in many ways, but those four core stages of design, development, testing and deployment are pretty much the standard framework for all large-scale technology projects.
With that in mind it is clear to see that at some point this code should have been noticed. At the design stage technical specifications should have been written which would have been used to determine the scope and functionality of the project by the development team. It is absurd to suggest that the development team would then create software outside the boundaries of those specifications. It simply doesn't happen that way and no amount of protest by Google will lead me to believe otherwise.
But even if we give Google the benefit of the doubt at this stage, the testing stage of the project would use these same technical specifications to audit the data coming back from its simulated tests. Any data which could not be explained by those technical specifications would raise alarms and be investigated. That is the whole point of testing software before it is deployed - to ensure that it is doing what it was designed to do and that it is stable.
You design software how?
You may want to revisit your section on software development. What you describe is a view in principle of a very rigid development process, as might be designed by an auditor. It bears little to no relation to how things work in the real world.
Let me point out a few of your biggest errors:
"It is absurd to suggest that the development team would then create software outside the boundaries of those specifications."
No, it's not. The general consensus seems to be that developers shine when given opportunities to push boundaries. In shops that develop software for public consumption, that (sometimes) takes second place to producing a quality product, but companies pushing to be on the cutting edge will allow their internal developers a much longer leash.
"Any data which could not be explained by those technical specifications would raise alarms and be investigated. That is the whole point of testing software before it is deployed - to ensure that it is doing what it was designed to do and that it is stable."
The first does not follow from the second. The point of testing software is to ensure that it does what it was designed to do, and that it is stable. But very rarely does testing extend to proving that the software does NOTHING BUT what it was designed to do, which is the gist of your first sentence.
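This asymmetry is easy to make concrete: a unit test written from the spec asserts that the required fields come back correctly, and it passes even when the function quietly does more. Everything below is a hypothetical sketch, not anyone's actual code.

```python
def scan_result(frame):
    """Return the fields the spec asks for -- but also keep the raw payload."""
    result = {"bssid": frame["bssid"], "ssid": frame["ssid"]}
    result["payload"] = frame.get("payload")   # extra behaviour no test looks for
    return result

def test_scan_result():
    # A spec-driven test: it checks only that the designed fields are correct
    out = scan_result({"bssid": "aa:bb", "ssid": "Net", "payload": b"secret"})
    assert out["bssid"] == "aa:bb"
    assert out["ssid"] == "Net"
    # There is no assertion that nothing ELSE was collected, so this passes

test_scan_result()
print("all tests pass")
```

Tests verify the presence of designed behaviour; verifying the absence of undesigned behaviour is a different and much rarer discipline.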
"But in the interests of objectivity, even if we accept that this code was not noticed during the testing stage (which really is stretching the realms of possibility), once a project has been deployed testing continues on live data. This is important because once a project is deployed in the real world it often behaves differently to how it behaves in a lab environment. Resource efficiency needs to be checked, external factors need to be controlled or at least mitigated and data has to be accurate. This means that even if all the above stages failed to notice the data being generated by this code, once in a live environment it would be impossible to miss."
This is the one which proved to me that you don't know the real world of software development AT ALL. Deployed projects do have testing, but that is usually reduced to minimal amounts to avoid causing performance problems. Resource efficiency monitoring and error logging would be about it. Given the likely relative sizes of the different types of data being collected, most compression effort and troubleshooting of space usage would likely be focused on the photographic component.
Your entire piece also entirely ignores one standard development practice which goes a long way to explain how code ends up in projects without the managers ever knowing it's there. And it's this practice that Google themselves claim caused this issue: the use of external libraries. Google's story is that the Wi-Fi library they used in the StreetView project was developed in their labs as an experimental project, and was included by the StreetView development team because it did what they needed, and they were either unaware or unconcerned that it collected more data than they needed. I'm not saying that I buy this story, but the fact that you don't even mention it puts a huge question mark on your understanding of this issue.
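Google's stated explanation maps onto a familiar pattern: a reused library returns a richer record than the caller needs, and the caller persists the whole thing without inspecting it. The library name, function, and fields below are all hypothetical, sketched only to show how over-collection can ride in unnoticed on a dependency.

```python
# A hypothetical experimental Wi-Fi library, later reused by another project
def wifi_lib_parse(frame):
    """Parse everything the frame contains -- the library's job as its authors saw it."""
    return {
        "bssid": frame["bssid"],
        "ssid": frame["ssid"],
        "payload": frame.get("payload"),   # data the eventual caller never asked for
    }

def collect(frame, store):
    # The caller only wants identifiers, but stores the library's full record
    store.append(wifi_lib_parse(frame))

store = []
collect({"bssid": "aa:bb", "ssid": "Net", "payload": b"email body"}, store)
print("payload" in store[0])   # True -- captured without the caller noticing
```

Whether that is what actually happened at Google is a separate question; the point is only that the mechanism is ordinary, not exotic.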
Finally, there is the issue of patents:
"Then on June 3rd 2010 as a result of ongoing class action suits in the US it emerged that Google had filed a patent application for similar technology in 2008, this reinforced our opinion that this could not have been rogue code. In order for a patent application to be filed, it seemed obvious to us that Google's legal department would have had to review the technology and submit the application. This also would suggest that the project had been funded which in itself would require the attention of managers, designers, developers and testers."
Software development companies try to patent EVERYTHING THEY DO -- even experimental stuff that they have no intention of actually using. They do it because they know every other software development company is trying to patent everything THEY do, and patent portfolios are used both offensively and defensively in this business. So the fact that Google applied for a patent means only that they developed the software, not that they ever intended to use it.
Your determination that Google did this deliberately is based on some very flawed (some might say naive) views of software development. There also seems to be some indication of bias -- you seem to be avoiding any points that lessen your case that it was deliberate.
I don't agree with what Google did, and I don't know whether they did it deliberately or not. But I know that you have not done the analysis necessary to determine whether or not they did it deliberately.
I agree with him.
Judging by this article there is no way that the Google project people didn't know that the Wi-Fi data was being collected and processed.
It is obvious that this information is of immense value to Google.
It is also obvious that they were not going to tell anyone they had this data, and they had to be forced on the issue.
Do they really believe their motto of 'don't be evil'?
I think Google should have to set up an escrow account to compensate everyone in the UK whose data it intercepted.
Roughly $20 billion ought to cover it!