Berkeley boffins build better spear-phishing black-box bruiser
Machine learning and code to detect, and alert on, attempts to extract passwords from staff
Security researchers from UC Berkeley and the Lawrence Berkeley National Laboratory in the US have come up with a way to mitigate the risk of spear-phishing in corporate environments.
In a paper presented at Usenix 2017, titled "Detecting Credential Spearphishing in Enterprise Settings," Grant Ho, Mobin Javed, Vern Paxson, and David Wagner from UC Berkeley, and Aashish Sharma of the Lawrence Berkeley National Laboratory (LBNL), describe a system that combines network traffic logs with machine learning to provide real-time alerts when employees click on suspect URLs embedded in emails.
Spear-phishing is a social engineering attack that involves targeting specific individuals with email messages designed to dupe the recipient into installing a malicious file or visiting a malicious website.
Such targeted attacks are less common than phishing attacks launched without a specific victim in mind, but they tend to be more damaging. High profile data thefts at the Office of Personnel Management (22.1 million people) and at health insurance provider Anthem (80 million patient records), among others, have involved spear-phishing.
The researchers are concerned specifically with credential theft since it has fewer barriers to success than exploit-based attacks. If malware is involved, diligent patching and other security mechanisms may offer defense, even if the target has been fooled. If credentials are sought, tricking the target into revealing the data is all that's required.
The researchers focus on attacks that attempt to impersonate a trusted entity. These may involve spoofing the name field in emails, inventing a name that's plausibly trustworthy, like firstname.lastname@example.org, or sending messages from a compromised trusted account. Another means of impersonation, email address spoofing, is not considered because it can be dealt with through email security mechanisms like DKIM and DMARC.
The challenge in automating spear-phishing detection is that such attacks are rare, which is why many organizations still rely on user reports to trigger an investigation. The researchers note that their enterprise dataset contains 370 million emails – about four years' worth – and only 10 known instances of spear-phishing.
So even a false positive rate of 0.1 per cent would mean 370,000 false alarms, enough to paralyze a corporate IT department. And the relative scarcity of spear-phishing examples means machine learning techniques lack the volume of data needed to train a viable model.
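For readers who want to check the sums, here is a minimal Python sketch of that base-rate arithmetic. It uses only the figures quoted above (370 million emails, 10 known attacks, a 0.1 per cent false positive rate); the function and constant names are our own illustration and do not come from the paper.

```python
# Back-of-the-envelope base-rate arithmetic using the article's figures.
# All names here are illustrative; nothing below is taken from the paper's code.

TOTAL_EMAILS = 370_000_000   # emails in the enterprise dataset
KNOWN_ATTACKS = 10           # known spear-phishing instances in that dataset


def expected_false_alarms(total_emails: int, false_positive_rate: float) -> int:
    """Number of benign emails flagged at a given false positive rate.

    Treats every email as benign, which is a fair approximation when
    only a handful of messages out of hundreds of millions are attacks.
    """
    return round(total_emails * false_positive_rate)


def false_alarms_per_attack(total_emails: int, false_positive_rate: float,
                            known_attacks: int) -> float:
    """How many false alarms an analyst wades through per real attack."""
    return expected_false_alarms(total_emails, false_positive_rate) / known_attacks


if __name__ == "__main__":
    rate = 0.001  # 0.1 per cent
    print("False alarms:", expected_false_alarms(TOTAL_EMAILS, rate))
    print("False alarms per real attack:",
          false_alarms_per_attack(TOTAL_EMAILS, rate, KNOWN_ATTACKS))
```

On those numbers, a detector that is wrong just once in a thousand emails would bury each genuine attack under tens of thousands of bogus alerts.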