The Register® — Biting the hand that feeds IT


Google, antitrust, and the 'Copygate' hypocrisy

'We can copy. But you can't'


Comment Google built a multi-billion-dollar advertising empire atop a service that does little more than copy information from other sources. And yet it chastises others when they do the copying.

It's an irony that could land the company in some very hot water.

Google made (countless) headlines last week when, after an intricate "sting operation", it accused Microsoft of "copying" its search results. Many missed the irony of "Copygate", but others quickly picked up on what can only be described as painfully obvious. It's not just that Google has made multi-billions selling ads alongside content copied from across the web. The company has also been known to "copy" Bing's iconic background images – if only briefly. And we would argue it quite blatantly copied the iPhone in building Android.

No doubt, Google would deny such things. And even if it didn't, it would argue that Bing's copying is different. According to Google's sting, when netizens search Google on certain Internet Explorer browsers, there are cases where Microsoft automatically lifts the results and plugs them straight into its own search engine. Granted, this isn't the wisest move. At the very least, it gives miscreants an easy way of gaming Microsoft's search engine.

But the irony is still there. Google is a company built on all sorts of copying. With its Google Books project, Mountain View copied millions of library books without asking permission from the authors and publishers.

On the surface, this seems little more than a source of amusement. But we've seen this irony before. The same Google attitude plays a significant role in the European Union's ongoing antitrust investigation into the company's search and ad practices.

The European Commission is probing Google after receiving complaints from a trio of companies that includes Foundem, a UK-based vertical search engine that focuses on comparison shopping. Foundem accuses Google of "exploiting its dominance of search in ways that stifle innovation, suppress competition, and erode consumer choice".

The company's complaint makes two overarching claims. It says that in some cases, Google uses "discriminatory penalties" to remove sites from its search-results engine regardless of how relevant they are to a user's query, and it says that Google's Universal Search setup is unfairly promoting the company's own services – including Google Maps, YouTube, and Google Product Search – over those of its competitors.

In 2006, Google effectively removed Foundem from its "organic" search results, and all but barred the company from purchasing search ads on Google AdWords. For more than three years, Foundem fought for a return to Google's search engine, and Google obliged only after Foundem took its story public in late 2009. Despite being reinstated, Foundem went ahead with its EU complaint.

Before the EU formally announced its investigation in November, Google made light of Foundem's complaint without actually addressing the issues at hand, subtly criticizing the makeup of the company's site. But then, the day the probe was announced, the company rolled out a new tactic. A Google spokesman told us that Foundem's site was a problem because 79 per cent of its content is "duplicated" from other sites. And the company told The Guardian something similar, saying the site was de-indexed because about 87 per cent was "copied" from elsewhere.

There's that word again.

According to The Guardian, Google explained that a high level of copying "leads to automatic downgrading in its search results". This seems rather odd, however, when you consider that Google copies its content from elsewhere. Part of Google's defense of Universal Search is that it's not showing its own content, only the content of others. (This isn't true, but it's the company's defense nonetheless.)

When we pointed out that he was criticizing Foundem for "un-original" content while arguing that Google was immune to criticism because of un-original content, the company spokesman told us that Google's situation was different. But he didn't exactly say how it was different.

Certainly, there are some types of copied content you don't want on a search results page. Just before accusing Microsoft of copying its search engine, Google rolled out a new algorithm designed to reduce "webspam". Google search guru Matt Cutts pointed to a pair of programming-centric queries where the change had an effect. Originally, both were giving preference to a site called efreedom that had copied content from stackoverflow.com. But after the change, the original stackoverflow links rose to the top.

This is only reasonable. It's welcome, in fact. A search engine, by design, should limit the sort of shamelessly pilfered content efreedom is throwing at people.

But in describing the webspam he was going after, Cutts used much of the same language Google has used to describe Foundem. "The algorithm change," Cutts said on his personal blog, "primarily affects sites that copy others’ content and sites with low levels of original content."

This wasn't lost on Foundem, which – in a blog post of its own – was quick to point out that while some copied content is unwanted, other copied content can be very useful indeed. Foundem does copy a majority of its content, but it's a search engine. "Copying, organising, and presenting the content of others is a defining characteristic of any search service," Foundem said, "including Google’s own."

Google is well aware of the distinction between webspam and a vertical search engine. After all, the company offers its own price-comparison engine, Google Product Search, and it receives prominent placement on the company's primary search engine thanks to Universal Search. And Google indexes various other vertical search engines, including – as of the end of 2009 – Foundem. This despite its somewhere between 79 and 87 per cent copied content, or whatever it is.

"The difference here is between service and content. Clearly, there are all kinds of services that aren’t required to author content," Foundem cofounder and CTO Adam Raff tells The Reg. "Google used to talk about a ‘lack of original content’, but lately seems to have made a strategic shift to calling it ‘copying content’," he said.

"'87 per cent of their content is copied from other sources', [Google will say of Foundem]," Raff told us. "They make it sound like a cheap form of spam where a site simply copies somebody else's content wholesale and runs Google ads on it to monetize it. But of course, for any legitimate search service, the vast majority of its content will have been copied from others."

Raff actually defends Google's claims that Microsoft is copying its search results, denying there's any hypocrisy at play. "The word ‘copy’ has a lot of different meanings," he says. "Using clickstream data to effectively copy a result from Google to Bing ... is, at the very least, a mistake. Apart from anything else, it’s a way to game Bing’s results."

The real hypocrisy, he says, lies elsewhere. "Another kind of copying altogether is the copying that any search engine does," he explains. "In the context of a search engine, this kind of copying is not only legitimate, it is essential. The real hypocrisy here is that Google has started attacking vertical search services by suggesting that this perfectly legitimate form of copying is somehow illegitimate for all vertical search services other than its own."

It's a hypocrisy that may not sit well with the European Commission. Google is keen to end the Commission's investigation, but it continues apace. ®


Huh??

You say that you don't see how google is different to a site that is just referencing other material, but if I were to search google for information on X and the first hit was www.google.com I'd be pretty pissed. I want the actual information on X, not a site that can take me to the information on X.

The reality is they built a company on a service that lets you find information, not one that copies it (ignoring google books and friends which came along later). They didn't (as far as I am aware at least) go and mine the search results of the existing search engines to build their search index, and I don't see why others should be allowed to mine theirs. It's not like google has exclusive access to all the information needed to build a decent search engine.

In general that article seems pretty anti-google, and while there may be lots of reasons to not like google I hardly think this article presents one of them.


Dates for Android vs. iPhone

Android was purchased by Google in 2005 and was obviously in existence before that. The first iPhone debuted in 2007. So I don't really see how Google copied Apple there. It was obvious that they had a long-term strategy to enter the mobile phone business, and that strategy paid off.

I agree with Mark 176; there are lots of reasons to not like various aspects of Google, and the few that are presented here are poorly executed.

I'd also add that anyone who understands search can tell the difference between indexing for search and copyright infringement and it appears that this article was trying to blur the line. That's not what I look for when I want editorials.


Misleading

This article sounds like a misleading hit piece. Since when is creating an index the same as copying? If that were the case, I guess we would say that a library's card catalog is a copy of the works it indexes. Of course, this isn't the case. Search engines create an index. When you click on the links that they show, you are redirected to the owner's site, not a copy made by the search engine that directs you. The difference in this case is that Bing is *copying*. It is copying the *index* (not content) that Google has created for specific search terms by scraping the results that users click on while using Google search in IE.

Consider a dictionary by Webster. It contains definitions for terms that everyone who speaks the language uses and knows. It is perfectly acceptable for the Oxford dictionary to contain all the same terms. However, Webster needed expertise and considerable research in order to gather all those words. If Oxford were to just lift all the words and definitions from Webster's word index without its own research, it would be obvious to most people that Oxford was offering nothing but a cheap knock-off. What Bing is doing may be perfectly legal but it still smells foul.

