Despite the headlines, Rudd's online terror takedown tool is only part of the solution
UK.gov schtum on false positives, appeals process and long-term impact
Analysis The UK government launched its swish new tool to fight online extremist content to much fanfare this week – leading news bulletins and generating reams of coverage. But it also faced a whole host of criticism and concern.
Underneath the claims made by Home Secretary Amber Rudd yesterday, there are many questions yet to be answered and a wealth of implications the government has failed to set out – instead preferring to demonstrate to the public that, if Silicon Valley won't take action, it will step in.
The tool, developed by ASI Data Science on a shoestring budget of £600,000, is pitched as a pre-filter system, meaning all flagged videos will be sent for human review before publication, rather than being pulled down once they are already online. (This approach has been pushed by PM Theresa May but condemned by campaigners as another form of censorship.)
The algorithm was trained to recognise traits particular to Daesh videos. ASI co-founder Marc Warner told The Register it had been trained and tested on three sets of videos: some 1,300 Daesh videos provided by the Home Office; about the same number of borderline videos that might trick the system, like legitimate news coverage; and 100,000 randomly selected YouTube videos.
He said the business had balanced false positives with performance to tune the algorithm to be able to detect 94 per cent of Daesh propaganda with a 99.995 per cent accuracy.
Accuracy versus false positives
The government PR machine and a number of news outlets have made much of these high accuracy levels, but you need real numbers to put them into context.
Assume there are 100 Daesh videos uploaded to a platform, among a batch of 100,000 vids that are mostly cat videos and beauty vlogs. The algorithm would accurately pick out 94 terror videos and miss six, while wrongly flagging five innocent ones (0.005 per cent of the rest). Some people might say that's a fair enough trade-off.
But if it is fed with 1 million videos, and there are still only 100 Daesh ones in there, it will still accurately pick out 94 and miss six – but falsely identify 50.
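For anyone who wants to check the sums, here is a minimal sketch of that arithmetic. It assumes the "99.995 per cent accuracy" figure means a 0.005 per cent false-positive rate on non-Daesh uploads, which is the reading the numbers above imply rather than anything ASI has published. The function name and parameters are made up for illustration.

```python
def expected_flags(total_uploads, daesh_videos,
                   detection_rate=0.94, false_positive_rate=0.00005):
    """Expected outcomes for one batch of uploads.

    Assumption: "99.995 per cent accuracy" is read as a 0.005 per cent
    false-positive rate applied to the non-Daesh videos in the batch.
    """
    innocent = total_uploads - daesh_videos
    caught = daesh_videos * detection_rate        # correctly flagged
    missed = daesh_videos - caught                # slipped through
    false_alarms = innocent * false_positive_rate # innocent videos flagged
    return caught, missed, false_alarms


# Same 100 Daesh videos, hidden in a small batch and a large one
for batch in (100_000, 1_000_000):
    caught, missed, false_alarms = expected_flags(batch, 100)
    print(f"{batch:>9,} uploads: {caught:.0f} caught, "
          f"{missed:.0f} missed, {false_alarms:.0f} false alarms")
```

The number of correct hits depends only on how many Daesh videos are in the pile, while the false alarms scale with everything else that gets uploaded.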
So if the algorithm was put to work on one of the bigger platforms like YouTube or Facebook, where uploads could hit eight-digit figures a day, the false positives could start to dwarf the correct hits. And this is where much of the Twitterati's concerns have focused.
"About 6 million YouTube videos uploaded per day. If 99.995% accurate (And that's a *marketing* claim, remember) then that's about 300 mistakes per day. No idea how often ISIS publish videos - once a day? So you're blocking 300 videos for every actual match. Huge overkill." https://t.co/Q6UP9FP8B5
– Zoe O'Connell (@zoeimogen), February 13, 2018
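The tweet's back-of-the-envelope sum can be reproduced in a few lines. This is a rough sketch only: the 6 million daily uploads and the one-Daesh-video-a-day guess are the tweet's own assumptions, the 0.005 per cent false-positive rate is our reading of the accuracy claim, and the function name is hypothetical.

```python
def flag_breakdown(daily_uploads, daesh_per_day,
                   detection_rate=0.94, false_positive_rate=0.00005):
    """Return (true hits, false alarms, share of flags that are real hits)."""
    true_hits = daesh_per_day * detection_rate
    false_alarms = (daily_uploads - daesh_per_day) * false_positive_rate
    share_real = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, share_real


# Assumptions taken from the tweet: 6 million uploads, one Daesh video a day
true_hits, false_alarms, share_real = flag_breakdown(6_000_000, 1)
print(f"False alarms per day: {false_alarms:.0f}")
print(f"Share of flagged videos that are real hits: {share_real:.2%}")
```

On those assumptions, well under one in a hundred flagged videos would be the real thing, which is the "overkill" the tweet is getting at.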
However, Warner told El Reg that focusing on the big firms was "kind of to miss the point of this work".
Rather, the idea is to offer a ready-made tool to smaller platforms that lack the technical expertise or resources to develop their own or invest in large-scale human review teams.
Warner pointed to a Home Office analysis that found 400 platforms hosting Daesh content – even if the big three or four firms get their acts into shape, hundreds of platforms will be left.
"Really, we want to make it available to the smaller firms," he said.
The government's press release name-checks three smaller companies it wants to "support" (or, less charitably, bully into action): pCloud, Telegra.ph and Vimeo. The latter has around 43,000 uploads a day – meaning that, at the same false-positive rate of about five per 100,000, the tool would wrongly flag roughly two videos each day.
Whether this balances out depends on how many videos Daesh chooses to upload to a platform of this size, but ASI would argue that a couple of false flags a day is a low enough number for one person to comfortably vet – without the firm having to hire a huge team like the Zuckerborg has, or rely on users flagging content after publication.
Error, review and appeal
The government's focus on getting small companies to use the tool also raises questions about the review process – if someone's video is incorrectly listed as extremist content, what recourse do they have?
"If material is incorrectly removed, perhaps appealed, who is responsible for reviewing any mistake? It may be too complicated for the small company," said Jim Killock, director of the Open Rights Group.
"If the government want people to use their tool, there is a strong case that the government should review mistakes and ensure that there is an independent appeals process."
He added that errors are also likely to go unreported because people are incentivised to accept removal of their content.
"To complain about a takedown would take serious nerve, given that you risk being flagged as a terrorist sympathiser, or perhaps having to enter formal legal proceedings."
Real-world use and gaming
Because this is a machine-learning system, the accuracy rates and false positives will change – for better or worse – when it's out in the wild.
Warner said this was something that needed to be monitored, but added that "how it's used in the real world depends on what the Home Office want to do with it" and how it offers the tool to firms.
Its accuracy will also depend on Daesh's ability to work around it – perhaps by avoiding posting on platforms that use the tech or by developing new styles of propaganda videos.
Although Warner stressed that the methodology would only be given to firms approved by the Home Office - "not dumped on the internet" - it is possible that Daesh will be able to understand how it works.
"If they're sharing the 'methodology', and letting small platforms use the tool, you can be pretty sure that IS and other extremists will get hold of this tool and find ways around it," said Paul Bernal, a privacy and IT expert at the University of East Anglia.
"If they only share it with 'trusted' platforms, that will presumably defeat the purpose – the extremists will use the less trusted platforms with relative impunity."
Alternatively, the move could push groups further underground, to platforms where it is harder for authorities to surveil and isolate suspected terrorists.
"We may regret chasing extremists off major platforms, where their activities are in full view and easily used to identify activity and actors," said Killock.
It's possible that presenting the tool as a magic wand solution to extremist content online could create the perception that fixing these problems is simple.
This benefits the government by ramping up the pressure on the tech firms it is constantly at loggerheads with, but could leave the public thinking tech can address other societal problems (which tech invariably shouldn't have responsibility for).
"We need to be very wary of slippery slopes here," said Bernal. "Using tools like this for extremist content might make people more enthusiastic for similar tools in other areas such as hate speech and trolling, for which they're far less suitable and likely to produce many false positives." ®