How to nuke websites you don't like: Slam Google with millions of bogus DMCA takedowns
Copyright allegations wreck search rankings
Analysis Big corporations are abusing the system for taking down copyright-infringing files and links by flooding it with millions of fake links, according to Google.
Under the Digital Millennium Copyright Act (DMCA), online providers are protected from legal liability for copyright-infringing content on their servers, provided they respond to requests from copyright owners to take down specific files.
It has never been a watertight system, but a recent filing [PDF] from Google to a formal study looking at the process (section 512 in the lingo) shows that in recent years it has become widely abused, with companies filing millions of bogus requests in an effort to shut down specific websites.
"A significant portion of the recent increases in DMCA submission volumes for Google Search stem from notices that appear to be duplicative, unnecessary, or mistaken," the ad-serving giant wrote in its formal response earlier this week.
"A substantial number of takedown requests submitted to Google are for URLs that have never been in our search index, and therefore could never have appeared in our search results."
In other words, companies are using Google's DMCA takedown service to request the web giant remove links that don't actually exist. And it isn't a small number either.
"For example, in January 2017, the most prolific submitter submitted notices that Google honored for 16,457,433 URLs. But on further inspection, 16,450,129 (99.97 per cent) of those URLs were not in our search index in the first place."
Basically, just over 7,000 of the more than 16 million URLs flagged by one company in a single month were actually in Google's index. And if you think it might be just one company doing it, you'd be wrong. Again, according to Google, "in total, 99.95 per cent of all URLs processed from our Trusted Copyright Removal Program in January 2017 were not in our index."
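As a quick check on the arithmetic, the two URL counts Google quotes for its most prolific submitter can be worked through in a few lines of Python. The variable names are illustrative and not taken from the filing; only the two counts come from the quote. Computed directly, the share of URLs not in the index comes out nearer 99.96 per cent than the 99.97 per cent quoted, a rounding difference that does nothing to change the picture.

```python
# URL counts quoted by Google for its most prolific submitter, January 2017.
urls_honored = 16_457_433       # URLs in notices Google honored
urls_not_in_index = 16_450_129  # of those, URLs never in the search index

# URLs that actually pointed at something in Google's index.
urls_in_index = urls_honored - urls_not_in_index

# Shares as percentages of everything submitted.
pct_not_in_index = 100 * urls_not_in_index / urls_honored
pct_in_index = 100 * urls_in_index / urls_honored

print(f"In index:     {urls_in_index:,} URLs ({pct_in_index:.2f} per cent)")
print(f"Not in index: {urls_not_in_index:,} URLs ({pct_not_in_index:.2f} per cent)")
```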
What appears to be happening is that intellectual property lawyers are taking whatever new movie, album, single or book exists, and copying the URL structure of thousands of different websites to send a takedown request – whether or not that file actually exists.
An inspection of these requests shows that they are clearly automated requests, as the same syntax is used over and over again for different websites and even the same websites, with every possible variation and combination included in a scattergun effort to hit on a real URL.
For example, one request sent today, February 23, to Google concerned The Billionaire's Obsession series of books by JS Scott. It lists 111 different URLs, all of which end with the search path "/search/all/J-S-Scott-The-Billionaire-s-Obsession-epub/" but with 111 different domain names in front of it.
That request is one of 139 sent today alone by a single company (MUSO.com Anti-piracy).
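The scattergun pattern in such a request is easy to spot programmatically. What follows is a minimal sketch, not a description of how Google or anyone else screens notices: it groups the URLs in a notice by path and counts how many distinct domains share each one. The function name `domains_per_path` and the `.example` domains are hypothetical; only the search path comes from the request described above.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def domains_per_path(urls):
    """Map each URL path to the set of domains it appears under."""
    grouped = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        grouped[parts.path].add(parts.netloc.lower())
    return dict(grouped)


# Hypothetical notice: one search path pasted onto placeholder domains.
path = "/search/all/J-S-Scott-The-Billionaire-s-Obsession-epub/"
notice = [
    f"http://{domain}{path}"
    for domain in ("site-one.example", "site-two.example", "site-three.example")
]

for shared_path, domains in domains_per_path(notice).items():
    print(f"{len(domains)} domains share {shared_path}")
```

A path that recurs across dozens of unrelated domains, as in the 111-URL request, suggests the URLs were generated from a template rather than found on the sites themselves.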
Which of course raises the question: Why? And the answer is three-fold:
- There is no reason not to. Once a company has created an automated script to throw out URLs and send them automatically to Google, it is extremely easy and fast to do so. But perhaps more importantly, there is no mechanism for punishing abuse of the system. Companies can send millions of requests and there is no comeback. They can send millions the next day. And the next.
- They will occasionally get one right. A 0.03 per cent success rate would ruin any other business, even spammers, but to IP lawyers, getting any positive result ever is a good outcome. Especially since they are paid by the hour.
- It focuses Google's attention on specific websites. In one respect, Google is to blame for this abuse of the system. In talking about its system for handling copyright infringement and DMCA takedowns, the company's legal director for copyright, Fred von Lohmann, told a Congressional hearing on the copyright issue not only that Google "relies on copyright owners to inform us" of infringing material, but that "Google has been demoting sites based on the number of takedown notices they receive from copyright owners." It would have taken big corporations' IP lawyers about three seconds to realize that sending millions of requests – even completely fake ones – for particular websites was likely to achieve their main goal of downgrading them from the first few pages of a Google search. And so that's what they have done.
Of course, none of this is contributing to a healthy solution to copyright infringement. But what is the solution?
Well, that is what the United States Copyright Office is trying to figure out with its Section 512 study. Google sent its response to the second round of the study on the last day before the deadline. The same day, the government agency extended its comment period to March 22, 2017.
There is no shortage of contributions or ideas for improving the system. What is clear is that the current system is a mess. But then what else is new? This is content on the internet we're talking about. ®
PS: A 2016 study linked to Google suggested that only one in 25 takedown requests wrongly identified the material. Curious!