Security rEsrchRs find nu way 2 spot TXT spam

Symantec boffins analyse 400,000 TXTs to develop new spam-spotting approach


Symantec boffins reckon it's no longer enough to shield users from malicious e-mail, and that spam and phishing over SMS now deserve some decent defences. They've even penned a study to back up the proposition, suggesting that around 97 per cent of SMS spam could be detected, with a false positive rate as low as 0.02 per cent.

The researchers, from Symantec offices in the UK, Ireland and the US, have published their paper on arXiv, saying that although spam detection is harder in SMS than in e-mail, it can be done.

SMS remains popular – even in an era of over-the-top messaging platforms that want to eat the carriers' lunch by shifting their texts to the data channel – and the paper argues that various habits in SMS make spam detection a problem. They cite “lexical variants”, along with contractions, wordplay and other obfuscations as posing challenges for anyone wanting to detect malicious messages.

With better baselines, the researchers argue, including text normalisation and substring clustering, these problems could be overcome.
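
To give a flavour of what text normalisation means here, the following is a minimal Python sketch, not the researchers' method. The substitution table and the `normalise` function are hypothetical, since the article does not describe the paper's actual rules. It covers only normalisation, not substring clustering.

```python
import re

# Hypothetical look-alike table; the paper's real normalisation rules
# are not described in the article.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                      "5": "s", "@": "a", "$": "s"})


def normalise(text: str) -> str:
    """Map a noisy text message to a plainer form for matching."""
    text = text.lower().translate(LEET)
    # Collapse runs of three or more identical characters down to two.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Replace anything that is not a lowercase letter or whitespace.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Squeeze whitespace.
    return " ".join(text.split())


print(normalise("FR33 priiiize!!! C4LL n0w"))
```

A table like this is deliberately crude: it rewrites every "1" as "i", so phone numbers and prices are mangled too, which is one reason a real system would need something smarter.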

Working with an unnamed US carrier, Symantec was able to use a large SMS dataset to test their machine learning approaches to spam-blocking. To avoid false positives, they note, they also used “a combination of behavioural and linguistic information” to get more robust results.

The researchers had around 400,000 text messages to work with (including 300,000 spam messages), allowing them to test what they describe as “clustered substring tokens from a subset of 100k messages using t-distributed stochastic neighbour embeddings … string similarity functions based on matching n-grams and word co-occurrences.”
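
One common way to build a string similarity function on matching n-grams is Jaccard overlap of character trigrams. The Python sketch below assumes that choice purely for illustration; the article does not say which similarity functions the paper used, and the names `char_ngrams` and `ngram_similarity` are made up.

```python
def char_ngrams(text: str, n: int = 3) -> set:
    """Return the set of character n-grams in text, padded with spaces."""
    padded = f" {text.lower()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap of character n-grams, between 0.0 and 1.0."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga and not gb:
        return 1.0  # two empty strings are treated as identical
    return len(ga & gb) / len(ga | gb)


spam = "win a free prize now"
variant = "w1n a free prize n0w"   # obfuscated copy of the same message
other = "see you at lunch"

print(ngram_similarity(spam, variant))
print(ngram_similarity(spam, other))
```

The obfuscated variant still shares most of its trigrams with the original, so it scores far higher than the unrelated message.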

To expand the total training data set, the researchers also cleaned up 200,000 Twitter messages (removing hashtags and user mentions). Their study used two approaches: MELA (message linguistic analysis) and MPA (messaging pattern analysis).
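
Stripping hashtags and user mentions can be sketched in a couple of lines of Python. This is an assumption about how such a clean-up might look, not the researchers' code, and `clean_tweet` is a hypothetical helper.

```python
import re

# Matches "#tag" or "@user": a marker followed by word characters.
_TAG_OR_MENTION = re.compile(r"[#@]\w+")


def clean_tweet(text: str) -> str:
    """Remove hashtags and user mentions, then squeeze whitespace."""
    return " ".join(_TAG_OR_MENTION.sub(" ", text).split())


print(clean_tweet("@bob gr8 game 2nite #winning lol"))
```

The pattern is simplistic: it would also chew into e-mail addresses, so a production clean-up would want a stricter rule.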

The MELA approach showed a 0.05 per cent false positive and 9.4 per cent false negative rate, the paper says, while MPA scored a much better 0.02 per cent false positives and just 3.1 per cent false negatives. ®
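
For readers wondering how the headline 97 per cent squares with these figures: the detection rate is simply one minus the false negative rate, so MPA's 3.1 per cent false negatives means about 96.9 per cent of spam is caught. The Python sketch below shows the arithmetic. The counts are invented to reproduce the reported rates and are not taken from the paper.

```python
def error_rates(tp: int, fp: int, tn: int, fn: int):
    """Return (false positive rate, false negative rate, detection rate)."""
    fpr = fp / (fp + tn)   # legitimate messages wrongly flagged as spam
    fnr = fn / (fn + tp)   # spam messages that slipped through
    return fpr, fnr, 1.0 - fnr


# Invented counts chosen only to match the reported MPA rates:
# 1,000 spam messages (31 missed), 10,000 legitimate (2 flagged).
fpr, fnr, detection = error_rates(tp=969, fp=2, tn=9998, fn=31)
print(f"false positives: {fpr:.2%}")
print(f"false negatives: {fnr:.1%}")
print(f"detection rate:  {detection:.1%}")
```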
