Google algorithm change squashes code geek 'webspam'

More Stackoverflow, please

Google has rolled out an update to its search algorithms designed to reduce "webspam", aka "the junk you see in search results when websites try to cheat their way into higher positions in search results or otherwise violate search engine quality guidelines".

In short, says Google principal engineer and search quality guru Matt Cutts, the company's search engine will show more preference to sites that generate original content, as opposed to sites that lift content from elsewhere. Google is pushing back against so-called content farms – at least a little. The algorithm change affects a relatively small number of search results. According to Cutts, searchers will "notice" the change on less than 0.5 per cent of queries.

A week ago, in response to several stories complaining of Google search spamminess, Matt Cutts published a blog post defending the company's search engine. "According to the evaluation metrics that we’ve refined over more than a decade, Google’s search quality is better than it has ever been in terms of relevance, freshness and comprehensiveness," he said. "Today, English-language spam in Google’s results is less than half what it was five years ago, and spam in most other languages is even lower than in English."

But at the same time, Cutts acknowledged a "slight uptick" in spam in recent months, and he said that Google was "evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content." And on Friday, with a post to his personal blog, Cutts announced that this change went live earlier this week.

He said that the change would affect about two per cent of all Google search queries, but that users would actually notice something on less than 0.5 per cent of queries. "It's a pretty targeted launch," he said. "The net effect is that searchers are more likely to see the sites that wrote the original content rather than a site that scraped or copied the original site’s content."

In a post to Hacker News, Cutts mentioned two programming-centric queries where the change comes into play: "pass json body to spring mvc" and "aws s3 emr pig". Apparently, both were giving preference to a site called efreedom, which has copied content from stackoverflow.com, rather than promoting the original stackoverflow links. And now they don't.

"An example would be that stackoverflow.com will tend to rank higher than sites that just reuse stackoverflow.com's content," Cutts said. "Note that the algorithmic change isn't specific to stackoverflow.com though." But he did not give other examples. ®
