Google algorithm change squashes code geek 'webspam'

More Stackoverflow, please

Google has rolled out an update to its search algorithms designed to reduce "webspam", aka "the junk you see in search results when websites try to cheat their way into higher positions in search results or otherwise violate search engine quality guidelines".

In short, says Google principal engineer and search quality guru Matt Cutts, the company's search engine will show more preference to sites that generate original content, as opposed to sites that lift content from elsewhere. Google is pushing back against so-called content farms – at least a little. The algorithm change affects a relatively small number of search results. According to Cutts, searchers will "notice" the change on less than 0.5 per cent of queries.

A week ago, in response to several stories complaining of Google search spamminess, Matt Cutts unloaded a blog post defending the company's search engine. "According to the evaluation metrics that we’ve refined over more than a decade, Google’s search quality is better than it has ever been in terms of relevance, freshness and comprehensiveness," he said. "Today, English-language spam in Google’s results is less than half what it was five years ago, and spam in most other languages is even lower than in English."

But at the same time, Cutts acknowledged a "slight uptick" in spam in recent months, and he said that Google was "evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content." And on Friday, with a post to his personal blog, Cutts announced that this change went live earlier in the week.

He said that the change would affect about two per cent of all Google search queries, but that users would actually notice something on less than 0.5 per cent of queries. "It's a pretty targeted launch," he said. "The net effect is that searchers are more likely to see the sites that wrote the original content rather than a site that scraped or copied the original site’s content."

In a post to Hacker News, Cutts mentioned two programming-centric queries where the change comes into play: "pass json body to spring mvc" and "aws s3 emr pig". Apparently, both were giving preference to a site called efreedom that has copied content from stackoverflow.com, rather than promoting the original stackoverflow links. And now they don't.

"An example would be that stackoverflow.com will tend to rank higher than sites that just reuse stackoverflow.com's content," Cutts said. "Note that the algorithmic change isn't specific to stackoverflow.com though." But he did not give other examples. ®
