IBM Boffins KNOW WHERE YOU LIVE, thanks to Twitter

"Woohoo I'm in Sydney" tells people you're in Sydney, it seems

If you thought refraining from geotagging your Tweets or photos was enough to keep your secrets from the world at large, think again: IBM researchers say a Twitter user's primary location can be inferred from their behaviour, with accuracy as high as 68 per cent.

In a paper posted to arXiv, Jalal Mahmud, Jeffrey Nichols and Clemens Drews of IBM Research at Almaden say they can make at least city-level predictions of Twitter users' “home” locations (by which they mean the primary location from which an individual usually Tweets), even when the user isn't using Twitter's location features.

To do this, the researchers produced two algorithms. The first uses behaviours such as the volume of Tweets from a user, plus external information (a dictionary of location names and services such as Foursquare). They say that while this algorithm works best when users make “explicit references” to locations in Tweets, it “still works with reduced accuracy when no explicit references are available”.

The second algorithm predicts locations “hierarchically using time zone, state or geographic region as the first level and city at the second level”.
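The two-level idea can be illustrated with a toy sketch. This is not the researchers' code: the `TermScorer` class, the count-based scoring, the region labels and the training rows are all hypothetical stand-ins, chosen only to show a first-level guess narrowing the second-level choice of city.

```python
from collections import Counter, defaultdict


class TermScorer:
    """Toy classifier (an assumption, not the paper's model): a label's score
    is how often the user's terms were seen with that label in training."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, examples):
        for terms, label in examples:
            self.counts[label].update(terms)
        return self

    def predict(self, terms):
        def score(label):
            return sum(self.counts[label][t] for t in terms)
        # sorted() keeps tie-breaking deterministic
        return max(sorted(self.counts), key=score)


def fit_hierarchy(examples):
    """examples: iterable of (terms, region, city) rows."""
    examples = list(examples)
    top = TermScorer().fit((terms, region) for terms, region, _ in examples)
    by_region = defaultdict(list)
    for terms, region, city in examples:
        by_region[region].append((terms, city))
    second = {region: TermScorer().fit(rows) for region, rows in by_region.items()}
    return top, second


def predict_city(top, second, terms):
    region = top.predict(terms)            # first level: region or time zone
    city = second[region].predict(terms)   # second level: cities in that region only
    return region, city


# Made-up training rows, purely for illustration
TRAINING = [
    (["subway", "bagel", "yankees"], "Eastern", "New York"),
    (["redsox", "chowder"], "Eastern", "Boston"),
    (["fog", "bart", "giants"], "Pacific", "San Francisco"),
    (["hollywood", "traffic"], "Pacific", "Los Angeles"),
]

top, second = fit_hierarchy(TRAINING)
prediction = predict_city(top, second, ["stuck", "in", "traffic", "near", "hollywood"])
print(prediction)
```

The point of the hierarchy is that the second-level scorer only has to tell apart cities within one region, a smaller and easier problem than choosing among every city at once.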

With a dataset of around 1.5 million Tweets from 9,551 users, the researchers then extracted features to feed their classifiers, including:

  • All words in the Tweets;
  • All hashtags in the Tweets; and
  • All city and state location names in the Tweets.
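As a rough sketch of what pulling those three kinds of terms out of a user's Tweets might look like, assuming a tiny hand-made gazetteer (`GAZETTEER` is hypothetical; the article says only that a dictionary of location names was used) and simple regular-expression tokenising:

```python
import re
from collections import Counter

# Hypothetical stand-in for the dictionary of location names
GAZETTEER = {"sydney", "new york", "san francisco", "california"}


def extract_features(tweets):
    """Count words, hashtags and known place names across a user's Tweets."""
    words, hashtags, places = Counter(), Counter(), Counter()
    for text in tweets:
        lowered = text.lower()
        hashtags.update(re.findall(r"#(\w+)", lowered))
        # Strip hashtags before counting plain words so they are not counted twice
        without_tags = re.sub(r"#\w+", " ", lowered)
        words.update(re.findall(r"[a-z']+", without_tags))
        for place in GAZETTEER:
            # Whole-word match, so a hashtag such as #sydney also counts as a mention
            if re.search(r"\b" + re.escape(place) + r"\b", lowered):
                places[place] += 1
    return {"words": words, "hashtags": hashtags, "places": places}


features = extract_features(["Woohoo I'm in Sydney", "Coffee time #sydney #monday"])
print(features["places"])
```

Counts like these would then be handed to whatever classifier is being trained; the sketch stops at extraction.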

Armed with this data, the researchers note, they can also make some assumptions about location. Given America's time zones, for example, a user in New York is more likely to be at home at 7:00PM Eastern time, while at that same moment a Californian user is probably still at work. That means the volume of a user's Tweets at different times of day becomes a hint to their location.
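The time-of-day hint can be made concrete with a small sketch. Everything here is an assumption for illustration rather than the paper's method: the 7PM-to-11PM "at home" window, the two candidate UTC offsets (standard-time Eastern and Pacific) and the function names are invented for the example.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_profile(timestamps):
    """Share of a user's Tweets falling in each UTC hour (timezone-aware datetimes)."""
    counts = Counter(ts.astimezone(timezone.utc).hour for ts in timestamps)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one timestamp")
    return [counts.get(hour, 0) / total for hour in range(24)]


def best_offset(profile, candidate_offsets, window=range(19, 23)):
    """Pick the UTC offset that puts most Tweets in the assumed evening window."""
    def evening_share(offset):
        # local hour = UTC hour + offset, so UTC hour = local hour - offset
        return sum(profile[(local - offset) % 24] for local in window)
    return max(candidate_offsets, key=evening_share)


# Made-up user who Tweets between midnight and 3AM UTC
stamps = [datetime(2014, 3, 1, hour, 15, tzinfo=timezone.utc) for hour in (0, 1, 1, 2)]
profile = hourly_profile(stamps)
offset = best_offset(profile, candidate_offsets=[-5, -8])
print(offset)
```

Midnight to 3AM UTC is early evening on the US east coast but mid-afternoon in California, so under the assumed window the Eastern offset scores higher for this user.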

The paper notes that “geo-tags are not used in any of our prediction algorithms, although around 65 per cent of the tweets in our dataset are geo-tagged”.

But don't worry, the researchers only intend their work to be used for good: “a journalist tracking an event on Twitter may want to know which tweets are coming from users who are likely to be in a location of that event, vs. tweets coming from users who are likely to be far away. As another example, a retailer or a consumer products vendor may track trending opinions about their products and services and analyse differences across geographies.

“Second, our examination of the discriminative features used by our algorithms suggests strategies for users to employ if they wish to micro-blog publicly but not inadvertently reveal their location”, the study notes. ®
