When algorithms attack, does Google hear you scream?
Inside Google's search penalties gulag
The algorithms versus protected free speech
Google's opinions on what is and is not 'quality' are clearly changeable, but they're also debatable - you might wonder what the point of providing added value content and services is, in certain circumstances. If, say, you're looking for the absolutely cheapest flight to New York, then you're going to want a really good flight comparison site. If a site provides you with that cheapest flight quickly and easily, what is it that it has done that merits Google search penalties?
Compare such a hypothetical site with an equally hypothetical travel site which offers a range of travel-related services plus city guides, hotel and restaurant reviews. This is a perfectly legitimate business model designed to induce users to hang around and buy more stuff, just like Amazon - but it's not necessarily 'better' than a site that performs a single service, and only a single service, very well.
In Raff's view: "Publicly, Google often emphasises the value of original content while rarely acknowledging the value of service. This is a convenient line for Google — first, because Google requires third party content to hang its ads on, and second, because it helps to foster the view that rival search services have little innate value."
Even if you don't agree with her on how convenient it is for Google to value (somebody else's) original content and discount the importance of service, there is an argument that its doing both of these things is based not on machine intelligence, but on very human prejudice.
The general impression that Google's search results are entirely automated and objective extends to people thinking they're ranked by how true they are. An aura of saintly impartiality is certainly helpful in the Department of Definitely Not Evil, and Google's own execs reinforce the 'untouched by human hand' legend, even as Google's own explanatory statements shift beneath them, and Google's own lawyers contradict them.
Last year, for example, Udi Manber, Google VP for Search Quality, told Popular Mechanics: "At Google we do not manually change search results... we have to find what weakness in the algorithm caused [an unsatisfactory] result and find a general solution to that, evaluate whether a general solution really works and if it's better, and then launch a general solution. That makes the process slower, but it puts a lot more discipline on us and makes it more unbiased."
And later that year, Google Fellow Amit Singhal stressed "no manual intervention... We are using all this human contribution through our algorithms. The final ordering of the results is decided by our algorithms using the contributions of the greater Internet community, not manually by us.
"We believe that the subjective judgment of any individual is, well ... subjective, and information distilled by our algorithms from the vast amount of human knowledge encoded in the web pages and their links is better than individual subjectivity."
Neither of these statements is untrue as such, but they are grossly misleading. Google has argued (in response to Searchking's 2002 action) that its page rankings are its "view or opinion", protected by constitutional provisions regarding free speech.
Defending against Kinderstart in 2006, Google's lawyers argued that "PageRank constitutes Google's subjective opinion concerning the relative importance of a website", and explained further: "Even if PageRank were entirely determined by an algorithm, which it is not, the creation of that algorithm would reflect the Google programmers' subjective assessment of the factors that lend to a website's relative significance and the weight to be accorded each factor... Google's PageRank of a website often differs from the rankings assigned by other search engines, which would not be the case if a ranking were an objective fact." (Our emphasis. See Google's response to Kinderstart).
So although Google's spokesman won't comment on whether Google's results are 'right', or whether they're just the company's opinion, Google's lawyers point to differences in ranking from one search engine to another as proof that they are opinion.
Of the two Googlers quoted above, Manber is the more enlightening. Singhal tells us that it's all down to the algorithm, while Manber states that Google has, as a matter of policy, determined that direct manual changes are not possible, which forces general solutions to be developed via algorithms which, the lawyers inform us, are manipulated by humans.
Singhal's reference to "information distilled by our algorithms from the vast amount of human knowledge..." seems particularly dubious against this backdrop. And as we've already seen that direct manual changes are possible via whitelisting, we must presume Manber is talking about the ideal rather than the actuality.
Objective measurement? Did we say that?
In feeding the legend of the algorithm, the execs are only the front end. In 2006, after Google's Kinderstart defence had argued that rankings were Google's subjective opinion, the Google Technology Overview said (contemporary eye-witness here): "PageRank performs an objective measurement of the importance of web pages by solving an equation of more than 500 million variables and 2 billion terms.
"PageRank also considers the importance of each page that casts a vote, as votes from some pages are considered to have greater value, thus giving the linked page greater value. Important pages receive a higher PageRank and appear at the top of the search results. Google's technology uses the collective intelligence of the web to determine a page's importance. There is no human involvement or manipulation of results, which is why users have come to trust Google as a source of objective information untainted by paid placement." (Our emphasis.)
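For readers wondering what "votes from some pages are considered to have greater value" amounts to mechanically, here is a minimal sketch of the textbook PageRank idea: each page repeatedly passes its current score along its outgoing links. It is an illustration only, not Google's production ranking. The toy link graph, the function name, the damping factor of 0.85 and the iteration count are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages in a link graph by repeated weighted voting.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not Google's values.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score evenly among its outlinks,
                # so a vote from a high-scoring page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its score over every page.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical six-page web. "x" and "y" each have exactly one incoming
# link, but x's comes from the well-linked "hub" and y's from "d".
toy_web = {
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub", "y"],
    "hub": ["x"],
    "x": [],
    "y": [],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph "x" outranks "y" even though each has a single inbound link, because the page voting for "x" is itself heavily linked. Note that every choice in the sketch (the damping factor, how dead-end pages are treated, when to stop iterating) is a decision made by whoever wrote it.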
The page was changed in May 2007, and now reads: "PageRank reflects our view of the importance of web pages by considering more than 500 million variables and 2 billion terms. Pages that we believe are important pages receive a higher PageRank and are more likely to appear at the top of the search results.
"PageRank also considers the importance of each page that casts a vote, as votes from some pages are considered to have greater value, thus giving the linked page greater value. We have always taken a pragmatic approach to help improve search quality and create useful products, and our technology uses the collective intelligence of the web to determine a page's importance."
One must surely conclude that Google itself no longer believes PageRank is an "objective measurement", for it is now merely "our view", and "pragmatic approach" could mean (or allow) almost anything. Also gone are the commitment that "Important pages receive a higher PageRank and appear at the top of search results" (which is a worry), the total lack of human involvement, and humanity's blind faith in Google as a source of objective information.
All in all, an impressive piece of self-flagellation. Apart, that is, from the stuff about the technology using the collective intelligence of the web - the unwary might take this to mean that Google is still the product of a God Machine unsullied by human fallibility.
Similarly, where this page now says a site's ranking "relies heavily on computer algorithms", it previously said it was "automatically determined by computer algorithms." The results from the particular search referred to on this page do seem to have changed markedly since it was first introduced, and it's possibly also worth noting that the wording: "The only sites we omit are those we are legally compelled to remove or those maliciously attempting to manipulate our results", noted by The Reg in 2004, seems to have disappeared.
We suggested to Google's spokesman that these various changes could be taken to mean that "there is now human involvement and manipulation of results, that PageRank is simply Google's editorial view, and that Google omits sites as and when it chooses, for reasons of its own choosing." Asked to confirm or deny this, he responded: "a site's ranking in Google's search results relies heavily on computer algorithms using thousands of factors to calculate a page's relevance to a given query...[but]...we will remove pages from our results if we believe the page (or its site) violates our Webmaster Guidelines, if we believe we are required to do so by law, or at the request of the webmaster who is responsible for the page."