Ruin your co-developers' lives with Mimic, the Unicode substitution tool

Don't try this if your co-workers have access to weapons


This is an idea of superlative malice: a developer has posted a GitHub project that replaces ASCII characters in C# code with near-homoglyphs from the Unicode character set.

Nobody would miss the substitution if emoji started popping up in their code, but “Mimic” from Greg Toombs is more subtle than that.

His script was inspired by a Tweet suggesting that the Greek question mark in Unicode, “;” (U+037E), is so close to the ASCII semicolon “;” that the compile errors it would cause in C# would be nearly impossible to track down.

So Toombs took the idea further, and created Mimic on the premise that “There are many more characters in the Unicode character set that look, to some extent or another, like others – homoglyphs. Mimic substitutes common ASCII characters for obscure homoglyphs.”

He provides the following examples of the abuse of Unicode:

First Mimic example - Greg Toombs

"Or," he writes, "if you've been Mimicked a little harder:"

Second Mimic example - Greg Toombs
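
For readers who want to see the mechanics, here is a minimal sketch of the kind of substitution Toombs describes. It is not Mimic's own code: the `HOMOGLYPHS` table and the `substitute` function are hypothetical names made up for illustration, and the three mappings are just a handful of well-known look-alikes.

```python
# Hypothetical look-alike table for illustration only; not Mimic's actual data.
HOMOGLYPHS = {
    ";": "\u037e",  # GREEK QUESTION MARK
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "o": "\u03bf",  # GREEK SMALL LETTER OMICRON
}


def substitute(text, mapping=HOMOGLYPHS):
    """Replace every character that appears in the mapping with its look-alike."""
    return text.translate(str.maketrans(mapping))


line = "int total = a + o;"
disguised = substitute(line)

# The two strings render almost identically but no longer compare equal.
print(disguised == line)
print([hex(ord(ch)) for ch in disguised if ord(ch) > 127])
```

The disguised line has the same length as the original and looks the same in most fonts, which is why a compiler complaint about it is so hard to make sense of.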

The implications go far beyond mere pranksterism, however: Toombs notes that Unicode substitutions can be used to evade indexing or censorship, get phrases past spam filters, or to hide plagiarism (since substitution in stolen code would make it hard for auto-detection software to pick up the copying).

There are defences: Toombs lists troll-stoppers for Vim, Emacs and Atom.
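
Outside an editor, the same kind of check can be approximated with a few lines of script. The sketch below is an assumption about how one might do it, not one of the editor settings Toombs lists: `find_non_ascii` is a hypothetical helper that reports every character above the ASCII range along with its official Unicode name.

```python
import unicodedata


def find_non_ascii(source):
    """Return (line, column, char, unicode_name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                # Fall back to "UNKNOWN" for code points without a name.
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# First statement ends in a Greek question mark, second in a real semicolon.
sample = "int x = 1\u037e\nint y = 2;"
for hit in find_non_ascii(sample):
    print(hit)
```

A check this blunt will also flag legitimate non-ASCII text in string literals and comments, so it is best treated as a prompt for a closer look rather than proof of sabotage.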

And let's not forget to lay blame at the feet of developer and author Peter Ritchie, whose November 2014 Tweet planted the seed of the idea for Mimic.

To quote Prostetnic Vogon Jeltz from the Hitchhiker's Guide to the Galaxy: “Death's too good for them”. ®



Biting the hand that feeds IT © 1998–2019