Revealed: Google's manual for its unseen humans who rate the web

Technology? Yes, but also toiling home-workers

It's widely believed that Google search results are produced entirely by computer algorithms - in large part because Google would like this to be widely believed. But in fact a little-known group of home-worker humans plays a large part in the Google process. The way these raters go about their work has always been a mystery. Now, The Register has seen a copy of the guidelines Google issues to them.

The 160-page manual gives detailed advice for raters - on relevance, spamminess, and - more controversially - the elusive "quality". For relevance, raters are advised to give one of six ratings: "Vital", "Useful", "Relevant", "Slightly Relevant", "Off-Topic or Useless" or "Unratable".

Raters may also be asked to give a spam rating: "Not Spam", "Maybe Spam", "Spam", "Porn" and "Malicious".

Interestingly, raters are not advised to rate websites with out-of-date security certificates as Spam or Malicious. At the time the rating guide was written, the US Army portal - for instance - used an out-of-date certificate.

Raters are asked to second-guess "user intent". "What was the user trying to accomplish when he typed this query?" asks the manual. Google classifies intentions into three categories: the first is "action intent" - a user wanting to "accomplish a goal or engage in an activity". Then there are what the Chocolate Factory calls "do queries" and navigational, or "go queries". They're not mutually exclusive, the guide stresses, and some are ambiguous, such as the search query "iPad".

Raters are advised to look for websites with content fresher than four months old - if it's older, it shouldn't be rated "Vital".

Much of this part of the guideline document is intended to cope with sites attempting to game Google. For example, this blog is cited as an example of "gibberish". Google's PageRank system was originally devised to rank authority according to popularity. This worked for academic papers, where frequently-cited documents tended to be the most important. Other tweaks were then added. But the increasing popularity of weblogs in 2003 caused all kinds of problems for Google, as they gamed the PageRank algorithm so effectively, creating a rat's nest of links.
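To see why a rat's nest of links works, here is a rough sketch of the textbook form of PageRank in Python. It is not Google's production ranking, which is not public; the page names, the tiny link graphs and the damping value of 0.85 are all assumptions made for illustration. Each page shares its score among the pages it links to, so a page that collects many inbound links floats to the top, whether those links come from genuine citations or from a batch of throwaway blogs.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank by repeated redistribution of scores.

    links maps each page name to the list of pages it links to.
    The damping value of 0.85 is an illustrative assumption.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: two papers cite a third, as in academic citation.
honest = {"paper_a": ["paper_c"], "paper_b": ["paper_c"], "paper_c": []}

# The same graph plus twenty made-up blogs that all link to one spam page.
farm = dict(honest)
farm.update({f"blog_{i}": ["spam_page"] for i in range(20)})
farm["spam_page"] = []

honest_rank = pagerank(honest)
farm_rank = pagerank(farm)
print(max(honest_rank, key=honest_rank.get))
print(farm_rank["spam_page"] > farm_rank["paper_c"])
```

In the first graph the twice-cited paper ranks highest. In the second, the spam page overtakes it simply because twenty blogs point its way, which is the weakness the blog farms exploited.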

By 2006, automated tools could create hundreds of blogs in just a few minutes - see our contemporary interview with the author of 'Blog Mass Installer' - populating them with machine-generated content that even humans found hard to distinguish from a human-generated site. This also posed an ethical business dilemma for Google, which had begun to grow rapidly from low-cost keyword search advertising placed on blogs. Google needed the blogs to help it grow, as each blog was a potential advertising space. But it couldn't afford to populate the search results with low quality, spammy blog results.

It's actually a reminder of how tricky it is to create good search results. What appears obvious to us - that a chain of hotels for pets is not suitable for a search query "hotels" - is not obvious to an algorithm. But isn't a pet hotel part of the web's rich tapestry, too? It's a deeply subjective decision. Here's where humans come in: it's astonishing to think such a decision isn't a subjective human choice - and a sign that we childishly believe computers are magic.

Google joked that trained pigeons rate the web. In fact, it's humans.

Google's human raters must also make decisions on pornographic material. Here, too, the Google Rater has to decide what the searcher's intention is. The example of "spanking" is cited: information on parents spanking children from the University of Maine is regarded as "relevant", a page about spanking fetish is "Slightly Relevant" and triggers the Porn flag. Porn is still deemed relevant - just not so much.

"Please do not assign a Porn flag to a non-porn page, just because the query has porn intent. If the landing page is not porn, it should not be flagged", says the guide.

But a subjective rating isn't all that there is. In addition to relevance, there's Page Quality - and that's a far more controversial and ambivalent yardstick.

Raters are invited to infer a website's reputation. For example, Google asks Raters: "What kind of Reputation Does the Website Have? ... negative or malicious reputation ... Mixed reputation ... Positive or OK reputation ... little or no information found ..."

It goes on to explain:

"Reputation research in Page Quality rating is very important. A positive reputation from a consensus of experts is often what distinguishes an overall Highest quality page from a High quality page. A negative reputation should not be ignored and is a reason to give an overall Page Quality rating of Low or Lowest."

It's controversial for a number of reasons. The web isn't a reliable feedback system - anonymous complaints are noisy and rife, and may not be representative. A site's detractors may also be motivated by an agenda that isn't obvious to a rater. And the Google advice to look for "a consensus of experts" doesn't always help. It depends on who the "experts" are. As an example, some academics - such as Evgeny Morozov - have already called for search engines to put warnings by climate sites that disagree with the "consensus" - fully entering into the editorial process.

Google is sensitive to the accusation that contractors could game the system. Matt Cutts insisted last year that "even if multiple search quality raters mark something as spam or non-relevant, that doesn't affect a site's rankings or throw up a flag". So, Google employs a network of site raters, devises a complex manual for them to follow, then ignores their judgements?

Who are the Raters?

Google outsources the ratings to contractors Leapforce and Lionbridge, who employ home workers. Lionbridge describes itself as a "global crowdsourcing" agency and lists the advertisements here. According to one Leapforce job ad there are 1,500 raters. The work is flexible but demanding - raters must pass an examination and are consistently evaluated by Google. For example, a rater is given a "TTR" score - "Time to Rate" - which measures how quickly they make their decisions. Here's one contractor's tale, and an interview at SEO site SearchEngineLand with another.

It's amazing how the image Google likes to promote - and politicians believe - one of high tech boffinry and magical algorithms, contrasts with the reality. Outsourced home workers are keeping the machine running. Glamorous, it isn't. ®
