Australasian Science: Australia's authority on science since 1938

Genetic Privacy at Risk

By Michael Cook

US intelligence agencies may be analysing our communications on a massive scale, but genetic data is already proving just as vulnerable.

By now you must be used to the idea of the US National Security Agency siphoning up your Facebook account, your email account, your Skype calls, your phone records, your chats and your browsing history. What’s left for them to trawl through?

How about your genome?

As the public reels from revelations about how much information American intelligence agencies have been gathering about foreigners and their own citizens, few have yet twigged to the vulnerability of genetic data.

“We are at a crucial juncture brought about by the confluence of new technologies for data generation, bioinformatics, and information access on the one hand, which seem to create new risks to privacy, and the public’s desire to benefit from these advances for a variety of personal and health reasons on the other hand,” scientists from the US National Institutes of Health wrote earlier this year in Science.

Violations of genetic privacy are already beginning to surface. In 2005 a teenager born from an anonymous sperm donation tracked down his biological father by combining Y chromosome data with genealogical information. Whatever you think about the dad’s ethics, promises made to him about anonymity proved to be illusory.

In June, London’s Times splashed the genetic history of Prince Harry and Prince William across its front page – information eagerly taken up by the media in Britain and India, as William will be the first monarch with Indian forebears. But this information was not only published without their consent; it was obtained without their consent. It was inferred from a genetic analysis of two of their distant cousins by a private company, Britain’s DNA.

But the biggest development came earlier, in January. Geneticists were shocked when an Israeli scientist showed that privacy invasion is possible on a grand scale. Yaniv Erlich, a geneticist at the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts, published a paper in Science showing that it was possible to identify the people who contributed DNA to research projects. He selected five DNA samples from a database of 1000 and was able to identify the persons from whom they came. All of the information was publicly available on the internet. “It was a very weird feeling – a ‘wow’ feeling,” Erlich told Associated Press. “I had to take a walk outside just to think about this process.”

Anyone can do it. A free, downloadable program called lobSTR selects haplotypes from the Y chromosome of a genome. Typically the genomic information has been de-identified, but it retains the age of the donor and the state of residence. The haplotypes are entered into free recreational genealogy websites that match them with surnames. By triangulating the surname with the donor’s age and state, it is possible to identify the donor.

And not only the donor, but his relatives as well. Erlich estimates that the 135,000 records in the two largest genealogical databases could be used to target several million American males. This figure will grow. Several thousand records are added to the databases every month, and genetic sequencing technology is becoming ever more powerful.
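To see why a surname, an age and a state are enough, it may help to look at the triangulation step as a simple filter. The sketch below is only an illustration under stated assumptions: it assumes a surname has already been suggested by a genealogy match, and the `Person` record, the `triangulate` function and every entry in `PEOPLE` are hypothetical and fictional, not taken from Erlich's paper or from any real database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical record shape for a public directory entry.
    name: str
    surname: str
    birth_year: int
    state: str


def triangulate(people, surname, age, state, sample_year, tolerance=1):
    """Return the people who match all three quasi-identifiers.

    The birth year is estimated from the donor's age at sampling time,
    allowing a small tolerance because the exact birthday is unknown.
    """
    target_birth_year = sample_year - age
    return [
        p for p in people
        if p.surname == surname
        and p.state == state
        and abs(p.birth_year - target_birth_year) <= tolerance
    ]


# Entirely fictional records standing in for a public directory.
PEOPLE = [
    Person("A. Example", "Example", 1950, "Utah"),
    Person("B. Example", "Example", 1978, "Utah"),
    Person("C. Example", "Example", 1951, "Ohio"),
    Person("D. Sample", "Sample", 1950, "Utah"),
]

matches = triangulate(
    PEOPLE, surname="Example", age=60, state="Utah", sample_year=2010
)
print([p.name for p in matches])
```

Each clue on its own leaves several candidates in this toy list; it is the intersection of all three that narrows the field to a single person.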

The loss of genetic privacy poses immense problems. Scientists worry that people will be reluctant to donate genetic material. “Researchers need to show the public that they are acting as careful stewards of the data entrusted to them,” an editorial in Nature warns. At the moment researchers assure participants that it will be hard for anyone to find out anything about them personally from their research. This is no longer true. “If you believe you can just encrypt terabytes of data or anonymize them, there will always be people who hack through that,” says Harvard geneticist George Church.

Some researchers contend that access to databases should require a licence, with severe penalties for scientists who breach privacy regulations. But research centres are relatively well organised and staffed by professionals.

What about the enormous genetic databases stored with police biobanks, hospitals and private genetic diagnostic centres? Will it be possible to hack into these and steal genetic data that could be used for identity fraud, targeting relatives, blackmail, discrimination against people with genetic diseases or pharmaceutical marketing?

The genome revolution has enormous potential for medicine, but there is a dark side. Our most intimate information, our DNA, is no longer our own.

Michael Cook is editor of BioEdge, a bioethics newsletter.