Gone in 30 minutes: Chinese tweets purged by army of censors

New report claims thousands of censors could be working for Sina Weibo

The murky world of online self-censorship in China has come under the spotlight again in a new report which estimates that most post deletions on the Twitter-like Sina Weibo occur within the first 30 minutes of appearing.

The report, The Velocity of Censorship: High-Fidelity Detection of Microblog Post Deletions, was written by academics at Bowdoin College, Rice University and the University of New Mexico alongside independent researcher Tao Zhu (h/t MIT Technology Review).

Sina claims its service has over 500 million users, but for the purposes of this research the team concentrated on the posts of around 3,500 “sensitive” users with a track record of censorship.

Having developed a system “which collects removed posts on targeted users in almost real time”, the researchers found that roughly 12 per cent of posts were deleted over the 15-day monitoring period – which amounts to more than 4,500 every day.

The research included the following observation (PDF):

Our research found that deletions happen most heavily in the first hour after a post has been made. Especially for original posts that are not reposts, most deletions occur within 5-30 minutes, accounting for 25 per cent of the total deletions of such posts. Nearly 90 per cent of the deletions of such posts happen within the first 24 hours of the post.

To enable such speedy censorship, the report claims a mixture of technical and non-technical filtering is used, with potentially thousands of staff employed to eyeball content, as per the following hypothesis:

The deletions happen most heavily for a regular post within 5 to 10 minutes of it being posted. Suppose an efficient worker can read 50 posts per minute, including the reposts and figures included in the posts. Then to read Weibo’s full 70,000 new posts in one minute, 1,400 workers working at the same time would be needed. If these workers only worked in 8 hour shifts, 4,200 workers would then be required.
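The headcount in that hypothesis is simple arithmetic, and it can be checked with a short sketch. The function and parameter names below are illustrative assumptions rather than anything from the report; only the three input figures (70,000 posts per minute, 50 posts read per worker per minute, 8-hour shifts) come from the quoted passage.

```python
import math


def censors_needed(posts_per_minute, reads_per_worker_per_minute, shift_hours):
    """Return (workers on duty at once, workers needed to cover 24 hours).

    Hypothetical helper: it only restates the report's back-of-envelope
    estimate and ignores breaks, duplicate posts and automated filtering.
    """
    # Workers who must be reading simultaneously to keep up with new posts.
    concurrent = math.ceil(posts_per_minute / reads_per_worker_per_minute)
    # Number of shifts required to cover a full day.
    shifts_per_day = math.ceil(24 / shift_hours)
    return concurrent, concurrent * shifts_per_day


concurrent, total = censors_needed(70_000, 50, 8)
print(concurrent, total)  # 1400 4200
```

Changing the assumed reading speed shows how sensitive the estimate is: halve it to 25 posts per minute and the total doubles to 8,400.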

Proactive keyword filtering blocks certain posts before they have gone live, or holds them for human review, while a range of retroactive mechanisms including backwards keyword and repost searches, public timeline filtering and monitoring of specific censorship-prone individuals were also highlighted in the report.
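As a rough illustration of the proactive step only, a filter of this kind could be sketched as a three-way decision made before a post goes live. Everything here is an assumption for illustration: the term lists are placeholders, the names are hypothetical, and the report does not describe Sina's actual implementation.

```python
from enum import Enum


class Decision(Enum):
    PUBLISH = "publish"
    HOLD_FOR_REVIEW = "hold_for_review"
    BLOCK = "block"


# Placeholder term lists, invented for this sketch.
BLOCK_TERMS = {"blocked-term"}
REVIEW_TERMS = {"review-term"}


def screen_post(text: str) -> Decision:
    """Decide what happens to a post before it is published."""
    lowered = text.lower()
    # Terms on the block list stop the post outright.
    if any(term in lowered for term in BLOCK_TERMS):
        return Decision.BLOCK
    # Terms on the review list hold the post for a human censor.
    if any(term in lowered for term in REVIEW_TERMS):
        return Decision.HOLD_FOR_REVIEW
    # Everything else goes live and is left to retroactive checks.
    return Decision.PUBLISH
```

Anything that returns PUBLISH here would still be exposed to the retroactive mechanisms the report lists, which is where the deletions measured by the researchers occur.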

The research also hypothesises that the censors work “relatively independently, in a distributed fashion”, with activity only really dipping between around 1-7am and again slightly at 7pm – which the report authors claim could be due to the national TV news programme broadcast at that time.

Although the report casts new light on the speed and accuracy of China’s web censors, it doesn’t explain why more isn’t done to block potentially illegal content before it is even posted.

One possible answer came from Sina Weibo manager @geniune_Yu_Yang (正版于洋), who – apparently frustrated by user anger directed at the company’s army of censors – wrote an illuminating post of his own back in January.

He effectively argued that Sina is trying to work around the strict regulations forced upon it by government, by at least letting users see and disseminate their content for a few minutes before it is deleted.

He wrote:

You can see the messages before they are deleted, right? You still have your account functioning, right? You are all experienced netizens, you know that the technology allows us to delete messages in a second. Please think carefully on this.

Now, there is no way of proving whether this manager was engaging in a crafty piece of well-timed PR or if there’s some truth to his claims.

Somewhat ironically, his post too was deleted, which illustrates perfectly the central problem with censorship of this kind: there's no way of telling whether a piece of content is deleted because it was true, or because it wasn't. ®
