Is Google legal?

Big in Belgium

Analysis A Belgian court ruled against Google’s use of newspaper stories in early September. If you believe Google, it did nothing wrong and only failed to defend itself because it was unaware of the publishers’ lawsuit. If you believe the publishers, Google is lying and infringes copyright on a colossal scale. The parties return to court on 23rd November in a case that leaves legal uncertainty hanging over the world’s leading search engines.

The case focused on Google’s news aggregation service, which automatically scans the websites of newspapers, extracting headlines and snippets of text from each story. These are displayed at Google News, and the headlines link users to the full stories on the source sites. Newspaper group Copiepresse, which represents Belgium’s leading French- and German-language publications, said this amounted to copyright infringement and a breach of database rights because its members had not been asked for permission.

Copiepresse could have stopped Google without going to court but chose not to. Instead, it wants Google to continue directing traffic to its sites – and it wants Google to pay for the privilege.

The court also ruled that Google’s cache, which is not part of Google News, infringed copyright.

When a person performs a search at Google, each result is displayed with a link to the page on the third-party site and also a link to a ‘cached’ copy of the same page stored on Google’s own site. The newspapers say this copy undermines their sale of archived stories. Why buy an archived story if you can find it in Google’s cache? Again, the newspapers could have stopped their pages from being cached.

Margaret Boribon, Secretary General of Copiepresse, told OUT-LAW that Google’s behaviour is “totally illegal” because it does not seek permission before extracting content for Google News or copying pages to its cache. Google disagrees.

Understanding Google’s position within the law means understanding how the search engine works.

Google uses an automated program known as the Googlebot to crawl the internet. It locates billions of pages and copies each one to Google’s index, breaking every page into tiny pieces and analysing and cross-referencing each element. That index is what Google interrogates to return search results to users. When the Googlebot visits a page, it also takes a snapshot that is stored in Google’s cache, a separate archive that lets users see how a page looked the last time the Googlebot visited.
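For the technically minded, the process can be sketched in a few lines of Python. This is purely an illustration of the crawl, index and cache pattern described above, not Google’s actual code, and the data structures are simplified assumptions:

    # Illustrative only: a toy crawl/index/cache, not Google's implementation.
    import re
    import urllib.request

    index = {}   # word -> set of URLs that contain it (the cross-referenced index)
    cache = {}   # URL  -> raw HTML snapshot from the last visit (the 'cache')

    def crawl(url):
        with urllib.request.urlopen(url) as response:
            html = response.read().decode("utf-8", errors="replace")
        cache[url] = html   # snapshot of how the page looked on this visit
        # break the page into pieces and cross-reference every word
        for word in set(re.findall(r"[a-z0-9]+", html.lower())):
            index.setdefault(word, set()).add(url)

    def search(term):
        # answer queries from the index, not by revisiting the live page
        return sorted(index.get(term.lower(), set()))

A real search engine adds ranking, link-following and politeness rules on top of this, but the division between the index it interrogates and the snapshot it serves from its cache is the same.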

It is easy for a website to keep the Googlebot, or any other search engine robot, away from all or some of its pages: a standard for doing so, known as the robots exclusion standard, has existed since 1994.

Add ‘/robots.txt’ to the end of any site’s web address and you will find that site’s instructions for search engines. Google also offers a simple way to stop a page being cached: include a ‘NOARCHIVE’ robots meta tag in the page’s HTML.
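By way of illustration (the blocked directory here is hypothetical), a robots.txt file that keeps the Googlebot out of a site’s archive pages would read:

    User-agent: Googlebot
    Disallow: /archives/

while a single page can opt out of Google’s cache with a meta tag in its HTML head:

    <meta name="robots" content="noarchive">

Google’s crawler honours both signals, so no agreement with Google is needed to opt out.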

When asked why her members’ news sites did not take these steps to exclude Google, Boribon replied: “then you admit that their reasoning is correct” – in other words, opting out would amount to accepting the search engines’ approach. She said all search engines should obtain permission before indexing pages that carry copyright notices.

But the real reason for not opting out with a robots.txt file or blocking caching is that Belgium’s newspapers want to be indexed by Google. “Yes, we have a problem with Google, but we don’t want to be out of Google,” Boribon said. “We want Google to respect the rules. If Google wanted to index us, they need to ask.”

Copiepresse also wants Google to pay for indexing sites. Boribon declined to discuss how or how much. "That has to be negotiated," she said.

The argument is not unique. The World Association of Newspapers (WAN), which represents 18,000 newspapers in 102 countries, said in January it would “explore ways to challenge the exploitation of content by search engines without fair compensation to copyright owners.”

At that time, WAN did not have a strategy for mounting that challenge. Copiepresse did. It took direct action and convinced the Brussels Court of First Instance to order Google to withdraw all articles and photographs belonging to Copiepresse members from its sites. Google was given 10 days to comply, under threat of a €1 million fine for each day of delay.
