Public genome databases can leak identity

Anonymity only goes so far

Public genome data poses a significant privacy risk to individuals, according to research led by Yaniv Erlich, a geneticist at the Whitehead Institute for Biomedical Research.

The team that Erlich led was able to de-anonymise genome data using only public information and careful Internet searches. A little chillingly, individuals could be associated with patrilineal genetic characteristics even if they weren’t in the databases themselves. A family member’s presence in a database can be enough, if the two are related in the male line and carry the same surname.

Working with data published in two public genomic databases, Ysearch and SMGF, Erlich demonstrated the privacy risk by matching Y-chromosome data to 50 individuals. The work appears in a paper published in Science (the full paper is available free with registration).

Among the genome data recorded in the databases are genetic markers called “short tandem repeats” (for which genetic science hasn’t yet identified a specific purpose). Those on the Y chromosome are passed down the male line.

As the paper notes, it had been assumed that listing surnames in the databases didn’t place individual identity at risk, since surnames “could match thousands of individuals”. However, the genome data has become a genealogy tool as well, in databases such as YBase.

DNA sequencing pioneer Dr Craig Venter volunteered as a test subject in the research. With only the relevant DNA sequence, Dr Venter’s age, and the US state where he lives, Erlich was able to retrieve just two possible records – one of which was Dr Venter.

With a known surname, the searches become even more accurate: “Combining the recovered surname with additional demographic data can narrow down the identity of the sample originator to just a few individuals,” Erlich states in the paper.
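The two-stage search described above can be illustrated with a minimal Python sketch. It is not the method used in the paper: the matching rule (exact agreement on shared markers), the record layouts, the function names, the surnames and the repeat counts are all invented for illustration. Only the general idea comes from the article: recover a surname from Y-chromosome markers, then narrow by age and US state.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Marker name -> repeat count. The values used below are made up.
Haplotype = Dict[str, int]


@dataclass(frozen=True)
class GenealogyEntry:
    """Hypothetical genealogy-database row: a surname plus Y-STR markers."""
    surname: str
    haplotype: Haplotype


@dataclass(frozen=True)
class PublicRecord:
    """Hypothetical public demographic record found by Internet search."""
    full_name: str
    surname: str
    age: int
    state: str


def mismatches(a: Haplotype, b: Haplotype) -> int:
    """Count markers present in both haplotypes whose repeat counts differ."""
    shared = a.keys() & b.keys()
    return sum(1 for marker in shared if a[marker] != b[marker])


def infer_surnames(query: Haplotype, entries: List[GenealogyEntry],
                   max_mismatches: int = 0, min_shared: int = 3) -> Set[str]:
    """Stage 1: collect surnames whose recorded haplotype matches the query."""
    hits: Set[str] = set()
    for entry in entries:
        shared = query.keys() & entry.haplotype.keys()
        if (len(shared) >= min_shared
                and mismatches(query, entry.haplotype) <= max_mismatches):
            hits.add(entry.surname)
    return hits


def narrow_candidates(surnames: Set[str], age: int, state: str,
                      records: List[PublicRecord],
                      age_tolerance: int = 1) -> List[PublicRecord]:
    """Stage 2: keep records that fit the surname, state and approximate age."""
    return [r for r in records
            if r.surname in surnames
            and r.state == state
            and abs(r.age - age) <= age_tolerance]


# Toy data: fictional surnames and invented repeat counts.
GENEALOGY = [
    GenealogyEntry("Alder", {"DYS19": 14, "DYS390": 24, "DYS391": 11}),
    GenealogyEntry("Birch", {"DYS19": 15, "DYS390": 23, "DYS391": 10}),
]
RECORDS = [
    PublicRecord("J. Alder", "Alder", 66, "CA"),
    PublicRecord("T. Alder", "Alder", 31, "CA"),
    PublicRecord("M. Alder", "Alder", 66, "NY"),
    PublicRecord("P. Birch", "Birch", 66, "CA"),
]

query = {"DYS19": 14, "DYS390": 24, "DYS391": 11}
surnames = infer_surnames(query, GENEALOGY)
candidates = narrow_candidates(surnames, age=66, state="CA", records=RECORDS)
print(surnames, [c.full_name for c in candidates])
```

In this toy data the surname alone matches three records, while adding age and state leaves one. That mirrors the article's point that a surname becomes identifying once it is combined with a few demographic details.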

“Surname inference from personal genomes puts the privacy of current de-identified public data sets at risk”, it continues.

“In five surname recovery cases, we fully identified the CEU* individuals and their entire families with very high probabilities … data release, even of a few markers, from one person can spread through deep genealogical ties and lead to the identification of another person who might have no acquaintance with the person who released his genetic data”. ®

*CEU refers to a particular genetic dataset: “multigenerational families of northern and western European ancestry in Utah who had originally had their samples collected by CEPH (Centre d’Etude du Polymorphisme Humain)”. ®
