IBM dissects the DNA of spam
Feng shui and genetics fight junk mail
IBM is applying ideas developed in sequencing DNA molecules to the detection of spam. Spammers have taken to inserting streams of gobbledegook or deliberately misspelling words in their spam messages in order to throw off anti-spam filters that rely on Bayesian statistical analysis alone.
In response, IBM is developing more sophisticated anti-spam filters. Boffins at Big Blue hit on the idea that programs built to find recurring patterns in DNA sequences could also find the recurring phrases that often feature in junk mail missives. The company developed a program called Chung-Kwei (named after a feng-shui talisman that protects homeowners against evil spirits) and trained it to spot repeated patterns in spam messages. IBM then fed a series of legitimate messages through the program in order to eliminate any patterns common to both spam and 'ham' (legitimate) messages.
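The two-step training described above (learn patterns that recur in spam, then discard any that also turn up in ham) can be sketched roughly as follows. This is not IBM's code: Chung-Kwei's actual pattern-discovery method is not described here, so the sketch swaps in plain fixed-length character n-grams, and the function names, the `n=6` window and the `min_support` threshold are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(text, n):
    """Return the set of distinct character n-grams in a message."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def learn_patterns(spam, ham, n=6, min_support=2):
    """Keep n-grams that recur across spam messages but never appear in ham.

    min_support is a hypothetical threshold: a pattern must occur in at
    least that many distinct spam messages to count as "recurring".
    """
    counts = Counter()
    for msg in spam:
        counts.update(ngrams(msg.lower(), n))
    recurring = {p for p, c in counts.items() if c >= min_support}

    # Second pass: drop every pattern that also shows up in legitimate mail.
    for msg in ham:
        recurring -= ngrams(msg.lower(), n)
    return recurring


def score(message, patterns, n=6):
    """Count how many learned spam patterns occur in the message."""
    return len(ngrams(message.lower(), n) & patterns)


# Toy training data, invented for this example.
spam = ["buy cheap meds for you today", "cheap meds for you today"]
ham = ["meeting for you today at noon"]

patterns = learn_patterns(spam, ham)
print(score("Cheap meds here", patterns))
print(score("lunch for you today?", patterns))
```

A message would then be flagged when its score passes some cut-off. Where that cut-off sits is a tuning decision that trades detection rate against false positives.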
New Scientist reports that the approach detects nearly 97 per cent of spam messages and has a far lower rate of false positives than conventional techniques (fewer than one in 1,000). IBM is using the filtering techniques, alongside a variety of other approaches, in developing an anti-spam product called SpamGuru. SpamGuru is shipping as a technology preview in Lotus Workplace 2.0, the next version of IBM's messaging and collaboration application. ®