How the CIA, Comcast can snoop on your sleep patterns, sex toy usage
The smart home may need to get a whole lot smarter, researchers warn
Smart home devices reveal much more personal information than you might imagine, it appears – even when the data is encrypted.
In a study [PDF] of seven popular products, the team from Princeton University in the US decided to dig into how much they could figure out about a person's daily habits just by analyzing the internet traffic their gizmos produce.
It turns out to be quite a lot. And, the team noted this month, the recent decision by the FCC, America's comms watchdog, to scrap broadband privacy rules days before they were due to come into effect means that your ISP is able to gather that information through its existing data collection.
Their paper – Spying on the Smart Home: Privacy Attacks and Defenses on Encrypted IoT Traffic – reveals that even when data from devices is encrypted, the metadata can help identify both the device and what it is signaling.
Some devices such as the Nest indoor camera directly communicate with identifiable domain names – in this case 'dropcam.com.' That immediately identifies what the product is, and it is then possible to infer from that and the resulting signal what is happening: whether it has detected motion or whether it is live streaming.
The same goes for the Sense sleep monitor, TP‑Link smart plug, and Amazon Echo. Even when the devices communicate with a generic hosting service – like Amazon's AWS – they typically use a specific address that can be used to identify the sensor (the Belkin WeMo switch, for example, communicated with the very specific prod1-fs-xbcs-net-1101221371.us-east-1.elb.amazonaws.com address).
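As a rough illustration of how little effort that identification takes, here is a minimal Python sketch of matching observed DNS query names against known device domains. The two domains in the table are the ones quoted above; the table layout, the function name `identify_device`, and the `nexus.dropcam.com` subdomain in the usage example are hypothetical and are not taken from the paper.

```python
# Hypothetical lookup table: domain -> device label.
# Only these two domains are quoted in the article; a real observer
# would need a far larger, measured list.
DEVICE_DOMAINS = {
    "dropcam.com": "Nest indoor camera",
    "prod1-fs-xbcs-net-1101221371.us-east-1.elb.amazonaws.com": "Belkin WeMo switch",
}


def identify_device(query_name):
    """Return the device whose domain matches a DNS query name, or None."""
    # Normalize case and drop the trailing dot of a fully qualified name.
    name = query_name.lower().rstrip(".")
    for domain, device in DEVICE_DOMAINS.items():
        # Match the domain itself or any subdomain of it.
        if name == domain or name.endswith("." + domain):
            return device
    return None


if __name__ == "__main__":
    # "nexus.dropcam.com" is a made-up subdomain used purely for illustration.
    print(identify_device("nexus.dropcam.com"))
    print(identify_device("example.org"))
```

The point of the sketch is that the lookup needs no decryption at all: the query name alone is enough to label the device.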
By digging into each device's signal, the team was able to figure out with some certainty exactly what was happening: someone was waking up, someone was turning on a light switch, someone had walked into the kitchen, and so on.
Given the fact that the same patterns are repeated, it would be very easy for an ISP to build a model that instantly analyzed and stored such patterns. And if an ISP can do it, anyone who can grab your internet traffic would be able to do the same.
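To make the idea concrete, here is a minimal sketch of the kind of rate analysis an observer could run on a single device's encrypted traffic. It assumes packets have already been captured as (timestamp, size) pairs; the function names, the one-second interval and the threshold are all hypothetical choices for illustration, not the researchers' actual model.

```python
from collections import defaultdict


def bytes_per_interval(packets, interval=1.0):
    """Sum packet sizes into fixed-width time bins.

    packets: iterable of (timestamp_seconds, size_bytes) pairs.
    Returns a dict mapping bin index -> total bytes seen in that bin.
    """
    bins = defaultdict(int)
    for timestamp, size in packets:
        bins[int(timestamp // interval)] += size
    return dict(bins)


def active_intervals(packets, threshold_bytes, interval=1.0):
    """Return the bin indices whose traffic exceeds a threshold.

    A spike above the device's usual background chatter is treated
    as a sign of user activity (an assumption of this sketch).
    """
    totals = bytes_per_interval(packets, interval)
    return sorted(i for i, total in totals.items() if total > threshold_bytes)


if __name__ == "__main__":
    # Made-up capture: quiet background traffic with a burst around t=5s.
    capture = [(0.1, 100), (0.5, 100), (5.2, 4000), (5.9, 3000), (9.0, 120)]
    print(active_intervals(capture, threshold_bytes=1000))
```

Nothing here touches packet contents, only sizes and timing, which is why encryption on its own does not help.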
"Smart home network traffic is susceptible to eavesdropping by other parties," they warn. "Such parties include ISPs, Wi‑Fi eavesdroppers, or state-level surveillance entities."
And while the team did not use an internet-connected sex toy in its test, it did note that the exact same analysis would reveal use of such items.
Which raises the question: how can you stop the CIA – or Comcast – keeping tabs on your dildo use?
The team dug into various methods, including:
- Cutting the devices off from the outside internet
- Using a VPN to shield traffic
- Adding noise to the system to disguise usage
Only the last method proved satisfactory, with the researchers noting with some surprise that several devices simply stopped working altogether if they didn't have an internet connection. Others lost enough functionality that they were basically equivalent to non-smart home (and much cheaper) products.
The VPN method was pretty good at disguising traffic, since it effectively strips the DNS interactions out from grabbable traffic. But the team was still able to discern a lot of information based on the time, type and amount of traffic.
"A smart door lock and smart sleep monitor are less likely to be recording user activity simultaneously," it notes. "Traffic observations from particular times of day are likely to contain non-background traffic from only one of these devices."
What did work, however, was adding noise to the system through independent link padding (ILP). The team wrote some code (under 100 lines, they say) that ran on the router and padded or fragmented all data packets to a constant size, and then buffered traffic or sent cover traffic to hide actual device data.
Basically, rather than seeing clear flows and peaks of data, the system spread that data out over time and ran the same amount of data constantly – making it almost impossible to figure out what was really going on.
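The researchers' router code is not reproduced in this article, so what follows is only a simplified sketch of the general idea under stated assumptions: every packet is padded or fragmented to one constant size, and exactly one packet leaves per time slot, carrying buffered device data if any is waiting and cover traffic otherwise. The 512-byte packet size, the slot-based simulation and the function names are hypothetical.

```python
from collections import deque

PACKET_SIZE = 512  # bytes; hypothetical constant size, not from the paper


def to_fixed_size(payload, size=PACKET_SIZE):
    """Fragment a payload into `size`-byte chunks, zero-padding the last.

    A real implementation would also need a length field so the receiver
    can strip the padding; that is omitted here for brevity.
    """
    chunks = [payload[i:i + size] for i in range(0, len(payload), size)]
    return [chunk.ljust(size, b"\x00") for chunk in chunks]


def shape(arrivals, ticks, size=PACKET_SIZE):
    """Simulate constant-rate sending over `ticks` time slots.

    arrivals: dict mapping tick number -> list of payloads (bytes)
              produced by devices at that tick.
    Returns a list of (kind, packet) tuples, one per tick, where kind
    is "real" for buffered device data and "cover" for dummy traffic.
    """
    buffer = deque()
    sent = []
    for tick in range(ticks):
        for payload in arrivals.get(tick, []):
            buffer.extend(to_fixed_size(payload, size))
        if buffer:
            sent.append(("real", buffer.popleft()))
        else:
            sent.append(("cover", bytes(size)))
    return sent


if __name__ == "__main__":
    # One 1,000-byte burst at tick 2 becomes two 512-byte packets.
    for kind, packet in shape({2: [b"x" * 1000]}, ticks=6):
        print(kind, len(packet))
```

To an outside observer every slot looks identical, one packet of the same size, whether or not a device had anything to say. The buffering step is also where added delay comes from: a large burst has to wait its turn in the queue.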
This brings its own problems, however:
- Latency
- Greater data usage
The system was much simpler when only very low-traffic devices such as smart plugs were in use. But when audio and video devices were used, the sheer amount of data meant that the effort to disguise it caused latency.
The worst impact was a 10‑15 second delay in getting video from the smartcam to a mobile device – which can be incredibly annoying – and a few seconds' delay in the Amazon Echo answering a question (also annoying). With high latency, the Echo cut off its reply mid-sentence, the team found.
Although the researchers acknowledge that the ILP approach is often written off for creating too much unnecessary data, by their calculations simple smart devices added only 2.5GB of data a month. With audio and video devices added in, however, that jumps to 104GB a month. They claim this is still far below a common cap of 1TB for many ISPs, but it is still a hefty data cost for additional privacy.
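For a sense of scale, those monthly totals can be turned back into the constant send rate they imply. The sketch below is back-of-the-envelope arithmetic only: it assumes a 30-day month, decimal units (1GB = 10^9 bytes, 1TB = 1,000GB), and that the padded stream runs around the clock. None of those assumptions, nor the function names, come from the paper.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month
BYTES_PER_GB = 10**9                   # assumes decimal gigabytes


def constant_rate_for(monthly_gb):
    """Constant send rate (bytes/second) that adds up to monthly_gb."""
    return monthly_gb * BYTES_PER_GB / SECONDS_PER_MONTH


def monthly_gb(rate_bytes_per_sec):
    """Monthly total (GB) produced by a constant send rate."""
    return rate_bytes_per_sec * SECONDS_PER_MONTH / BYTES_PER_GB


def share_of_cap(monthly_total_gb, cap_gb=1000):
    """Monthly total as a percentage of a data cap (default 1TB)."""
    return monthly_total_gb / cap_gb * 100


if __name__ == "__main__":
    for total in (2.5, 104):
        rate = constant_rate_for(total)
        print(f"{total}GB/month is about {rate:,.0f} bytes/s, "
              f"or {share_of_cap(total):.2f}% of a 1TB cap")
```

Under those assumptions, 2.5GB a month works out to roughly 1KB per second of constant traffic, and 104GB a month to roughly 40KB per second, or about a tenth of a 1TB cap.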
What are other possible variations on this approach?
A random or variable data pattern – rather than a flat, constant data stream – could help disguise traffic while also helping bring down data usage and latency.
Device manufacturers could also rethink how their devices send information. It is not necessary, for example, for a sleep device to send information instantly – it could delay the sending of data for several hours with no discernible impact.
People could also combine the use of a VPN with a less-aggressive form of ILP to add privacy at a lower cost. Or router manufacturers could give people the option to select different levels of privacy on a dial-like setup to get the right level of performance combined with privacy. Or allow some devices to respond unimpeded but pull others into a privacy mix.
Ultimately the paper does a good job of identifying something that is going to become an ever-greater concern: people's personal habits effectively becoming visible – and saleable – through traffic analysis.
We could easily see a router manufacturer figuring out a way to disguise such traffic and use a new privacy setting as a unique selling point, or to differentiate themselves in the market. But as this paper demonstrates, such packet manipulation can become complex quite quickly and would likely come with some compromises over speed and quality. ®