Tumblr resorts to AI in attempt to scrub itself clean from filth

But sorting out the good boobs from bad boobs isn't easy

Tumblr is hoping to use machine-learning algorithms to automatically cleanse its social media platform of explicit pornographic content, meaning genitals and female nipples.

Jeff D’Onofrio, Tumblr's CEO, announced it would be rolling out a system to flag content uploaded to the site, as well as removing existing smut, from 17 December onwards. The move comes after Tumblr’s iOS app was banned from Apple’s App Store last month for distributing child pornography.

Tumblr has defined adult content as “photos, videos, or GIFs that show real-life human genitals or female-presenting nipples, and any content—including photos, videos, GIFs and illustrations—that depicts sex acts.”

There are exceptions, however. Pictures of breasts are allowed on the site if they relate to breastfeeding, childbirth, or surgical operations such as mastectomy or gender reassignment.

It may sound like a good idea at first, but there are downsides. Quite a lot of content on Tumblr is, or was, pornographic, so this move will wipe away a large chunk of the platform, and its visitors, while other stuff, such as pages lauding Nazism, is allowed to stay. It may be an overreaction to being booted out of the App Store for carrying child sex abuse imagery.

Then there's the technology involved. Os Keyes, a PhD student at the University of Washington in the US, who is studying gender and algorithms, told The Register: “Using a binary classifier to identify masculine and feminine nipples for the purposes of cutting out porn is ludicrous.

“What does a classifier for porn based on nipples even look like? That is, assuming it 100% accurately detects nipples in the first place – what makes a masculine nipple? What makes a feminine nipple? Hair presence? Areola size? Breast structure? Whichever heuristics you pick, if you're looking specifically at nipples then you are not measuring what you think you're measuring – which is how the image overall would be gendered.”
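Tumblr has not disclosed how its classifier actually works, so as a purely hypothetical illustration of the brittleness Keyes describes, here is a toy binary classifier that flags an image as "adult" whenever a single invented heuristic – the fraction of skin-tone pixels – crosses a threshold. The heuristic and threshold are made up for this sketch and are not Tumblr's method:

```python
# Toy illustration only: a one-heuristic binary "adult content" classifier.
# The skin-tone box and the 0.4 threshold are invented here to show why
# single-feature classifiers misfire -- this is NOT Tumblr's algorithm.

def skin_fraction(pixels):
    """Fraction of RGB pixels that fall inside a crude 'skin tone' box."""
    def looks_like_skin(rgb):
        r, g, b = rgb
        return r > 95 and g > 40 and b > 20 and r > g and r > b
    if not pixels:
        return 0.0
    return sum(looks_like_skin(p) for p in pixels) / len(pixels)

def classify(pixels, threshold=0.4):
    """Binary decision: 'adult' if the heuristic crosses the threshold."""
    return "adult" if skin_fraction(pixels) >= threshold else "clean"

# A tan-coloured dog photo trips the heuristic just as easily as skin does:
tan_dog = [(180, 120, 90)] * 80 + [(30, 30, 30)] * 20  # 80% "skin-like"
print(classify(tan_dog))  # -> adult (a false positive)
```

Whatever single cue such a classifier keys on, anything else that shares the cue gets swept up with it, which is Keyes' point about measuring the wrong thing.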

We asked Tumblr for comment on how its classification algorithm worked, and how it was trained and tested. We did not receive an answer.


The final decision to take down an image is left to human moderators so as to prevent false positives, according to a statement. “This work requires a mix of machine-learning classification and human moderation by our Trust & Safety team—the group of individuals who help moderate Tumblr. When you appeal a post we’ve marked as adult, it gets sent to a real, live human who will look it over with their real, live human eye(s),” it said.
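Based on Tumblr's statement, the flow is: the classifier flags a post, the user appeals, and a human on the Trust & Safety team makes the final call. A minimal sketch of that queue, with every class and method name invented here for illustration:

```python
# Minimal sketch of a flag-then-appeal moderation queue, based only on
# Tumblr's public description of the process. All names are invented.

from collections import deque

class ModerationQueue:
    def __init__(self):
        self.flagged = {}       # post_id -> classifier score
        self.appeals = deque()  # post_ids awaiting human review

    def machine_flag(self, post_id, score, threshold=0.5):
        """Step 1: the classifier marks a post as adult content."""
        if score >= threshold:
            self.flagged[post_id] = score

    def appeal(self, post_id):
        """Step 2: the user appeals; the post joins the human queue."""
        if post_id in self.flagged:
            self.appeals.append(post_id)

    def human_review(self, post_id, is_adult):
        """Step 3: a Trust & Safety reviewer makes the final call."""
        self.appeals.remove(post_id)
        if not is_adult:
            del self.flagged[post_id]  # false positive: restore the post

q = ModerationQueue()
q.machine_flag("dog-photo", score=0.8)       # classifier misfires
q.appeal("dog-photo")
q.human_review("dog-photo", is_adult=False)  # human overturns the flag
print("dog-photo" in q.flagged)  # -> False
```

Note that in this design a false positive only gets corrected if the user bothers to appeal, which is exactly the gap critics worry about.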

Keyes remains skeptical even with a human team as a backstop. “In the case of permitting medical imagery, medical versus pornographic is a relatively subjective thing. Whose idea of ‘medical’ is being applied, here? And what happens when they get things wrong? Humanity is diverse; these kinds of classifiers, which demand that what they're looking for be rigidly categorisable, cannot be.”

In fact, users are sharing on Twitter, under the hashtag #toosexyfortumblr, images that Tumblr has mistakenly flagged. These include innocent pictures of dogs and cats, design patents for a device that scrubs boots, jeans with embedded LED lights, and even a picture of a hand. The software also, reportedly, takes a dim view of anything non-heteronormative, somewhat upsetting LGBTQ communities.

Good job, AI... ®
