Is Google legal?

Big in Belgium

Analysis A Belgian court ruled against Google’s use of newspaper stories in early September. If you believe Google, it did nothing wrong and failed to defend itself only because it was unaware of the publishers’ lawsuit. If you believe the publishers, Google is lying and is infringing copyright on a colossal scale. The parties return to court on 23rd November in a case that leaves legal uncertainty hanging over the world’s leading search engines.

The case focused on Google’s news aggregation service, which automatically scans the websites of newspapers, extracting headlines and snippets of text from each story. These are displayed at Google News and the headlines link users to the full stories on the source sites. Newspaper group Copiepresse, which represents leading Belgian, French and German publications, said this amounted to copyright infringement and a breach of database rules because its members had not been asked for permission.
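
To make the mechanism concrete, here is a minimal sketch, in Python, of the kind of extraction involved: pulling a headline and a short snippet from a story’s HTML. The function name and the crude tag-stripping are inventions for this illustration, not a description of Google’s actual pipeline.

    import re

    def headline_and_snippet(html, snippet_len=150):
        """Scrape a headline and a short text snippet from a page of HTML."""
        match = re.search(r"<title[^>]*>(.*?)</title>", html, re.S | re.I)
        headline = match.group(1).strip() if match else ""
        text = re.sub(r"<[^>]+>", " ", html)      # strip tags crudely
        text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
        return headline, text[:snippet_len]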

Copiepresse could have stopped Google without going to court but chose not to. Instead, it wants Google to continue directing traffic to its sites – and it wants Google to pay for the privilege.

The court also ruled that Google’s cache, which is not part of Google News, infringed copyright.

When a person performs a search at Google, results are displayed with a link to the page on the third party site and also a link to a ‘cached’ copy of the same page stored at Google’s own site. The newspapers say this copy undermines their sale of archive stories. Why buy an archived story if you can find it in Google’s cache? Again, newspapers could have stopped their pages being cached.

Margaret Boribon, Secretary General of Copiepresse, told OUT-LAW that Google’s behaviour is “totally illegal” because it does not seek permission before extracting content for Google News or copying pages to its cache. Google disagrees.

Understanding Google’s position within the law means understanding how the search engine works.

Google uses an automated program, known as the Googlebot, to crawl the web. It locates billions of pages and copies each one to Google’s index. In doing so it breaks each page into tiny pieces, analysing and cross-referencing every element. That index is what Google interrogates to return search results for users. When the Googlebot visits a page, it also takes a snapshot that is stored in Google’s cache, a separate archive that lets users see how a page looked the last time the Googlebot visited.
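
For readers who want to see the moving parts, here is a minimal sketch of that crawl-index-cache pipeline, again in Python. It is illustrative only: the single-word inverted index and every name in it are invented for this example, and say nothing about how Google’s systems are really built.

    import re
    import urllib.request

    index = {}   # inverted index: word -> set of URLs containing it
    cache = {}   # URL -> snapshot of the page as last fetched

    def crawl(url):
        """Fetch a page, store a snapshot, and index its words."""
        with urllib.request.urlopen(url) as response:
            html = response.read().decode("utf-8", errors="replace")
        cache[url] = html                      # the 'cached' copy
        text = re.sub(r"<[^>]+>", " ", html)   # strip tags crudely
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(word, set()).add(url)

    def search(word):
        """Return every crawled URL whose text contained the word."""
        return index.get(word.lower(), set())

After a call to crawl('http://example.com/') (a placeholder address), search('belgium') would return every crawled URL containing that word, and cache['http://example.com/'] would hold the snapshot.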

It is easy for a website to keep the Googlebot and other search engine robots away from all or particular pages: a standard for doing so, the robots exclusion standard, has existed since 1994.
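
For illustration, a robots.txt file that kept the Googlebot out of a site’s archive directory while leaving the rest open to all robots might read as follows (the paths are hypothetical):

    # Served from the site root, e.g. http://example.com/robots.txt
    User-agent: Googlebot
    Disallow: /archives/

    User-agent: *
    Disallow:

An empty Disallow line means nothing is off limits; ‘Disallow: /’ would bar a robot from the whole site.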

Add ‘/robots.txt’ to the end of any site’s web address and you’ll find that site’s instructions for search engines. Google also offers a simple way to prevent a page being cached: a ‘NOARCHIVE’ instruction written into the code of the page.
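
Concretely, that instruction is a robots meta tag placed in the HTML head of the page. The first form below addresses all robots, the second Google’s crawler alone:

    <!-- In the page's <head>: do not store or serve a cached copy -->
    <meta name="robots" content="noarchive">

    <!-- Or, addressed to Google's crawler only -->
    <meta name="googlebot" content="noarchive">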

When asked why her members’ news sites didn’t take these steps to exclude Google, Boribon argued that opting out would concede Google’s logic: “then you admit that their reasoning is correct”. She said all search engines should obtain permission before indexing pages that carry copyright notices.

But the real reason for not opting out with a robots.txt file or a NOARCHIVE instruction is that Belgium’s newspapers want to be indexed by Google. “Yes, we have a problem with Google, but we don’t want to be out of Google,” Boribon said. “We want Google to respect the rules. If Google wanted to index us, they need to ask.”

Copiepresse also wants Google to pay for indexing its members’ sites. Boribon declined to discuss how, or how much. “That has to be negotiated,” she said.

The argument is not unique. The World Association of Newspapers (WAN), which represents 18,000 newspapers in 102 countries, said in January it would “explore ways to challenge the exploitation of content by search engines without fair compensation to copyright owners.”

At that time, WAN had no strategy for mounting a challenge. Copiepresse did. It took direct action and convinced the Brussels Court of First Instance to order Google to withdraw from its own sites all the articles and photographs of Copiepresse members. Google was given 10 days to comply, under threat of a €1 million fine for each day of delay.
