Your genome has been hacked! …literally. Recently, a direct-to-consumer genetic testing company was hacked, and information about millions of people was compromised (Thielking, 2018). In this day and age, it is not surprising when a company gets hacked. What is surprising is one of the ways a genetics company can be hacked. The sneaky route involves malware encoded in a DNA sample. When the sample is sequenced, the malware grants the hackers access. It sounds like a Hollywood movie plot, but it happened (Thielking, 2018).

Just how private is our biological information, and are people concerned? In a recent survey, about 47% of people had concerns about the confidentiality of their genetic information derived from genealogical genetic testing (Hensley, 2018). Sadly, genetic information is difficult to keep secret. Aside from companies or other holders of genetic information being hacked, you can also lose your genetic privacy if your blood relatives choose to give away their genetic info like beads at Mardi Gras. This should not be a surprise, given the growing forensic use of relatives’ DNA to crack cold cases.

Is this only a genes problem, or are an individual’s proteins also a privacy risk? Although protein levels can change rapidly in response to many different environmental changes, we can still be identified by our proteins. Not too long ago, a group of researchers used a mass spectrometry approach to identify a person based on the proteins found in a hair sample (Parker et al., 2016). They demonstrated that if a person carries a genetic variation that alters the amino acid sequence of a protein, then that person could be identified based on the protein sequence in the sample. Although hair has a relatively small number of proteins, it is likely that more complex protein samples, such as blood, could also be used to single out a person.
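To make the idea concrete, here is a minimal toy sketch in Python. It is not the method used by Parker et al.; the peptide strings, the donor names, and the exact-match rule are all invented for illustration. It only shows the general logic: peptides that differ from a reference reveal a person's sequence variants, and a particular set of variants can point to one donor.

```python
# Toy illustration only: all peptide strings and donor names are invented.
REFERENCE_PEPTIDES = {"LSVEALNSLTGEFK", "DAEAWFNEK", "YEELQITAGR"}

# Hypothetical variant peptides each donor would be expected to carry.
DONOR_VARIANTS = {
    "donor_A": {"LSVEALNSLTGEFR"},
    "donor_B": {"DAEAWFNEQ", "YEELQVTAGR"},
    "donor_C": set(),  # carries no variant peptides in this toy panel
}


def variant_peptides(observed):
    """Return the observed peptides that are absent from the reference set."""
    return set(observed) - REFERENCE_PEPTIDES


def consistent_donors(observed, donors=DONOR_VARIANTS):
    """Return donors whose expected variant peptides exactly match the sample.

    Exact matching is a simplifying assumption; real measurements miss
    peptides, so a realistic comparison would need to be probabilistic.
    """
    seen = variant_peptides(observed)
    return sorted(name for name, expected in donors.items() if expected == seen)


sample = ["DAEAWFNEQ", "YEELQVTAGR", "LSVEALNSLTGEFK"]
print(consistent_donors(sample))  # prints ['donor_B']
```

In this toy panel, two variant peptides are enough to single out one donor among three.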

Is there a way to make mass spectrometry data less revealing? One proposal is to remove some of the potentially distinguishing data, which would make it harder, though not impossible, to link the data back to a specific individual (Li, Bandeira, Wang, & Tang, 2016). This is a step in the right direction, but it may already be too late.
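As a rough sketch of what removing distinguishing data could look like, the Python snippet below drops every peptide that does not match a reference set before a dataset is shared. This is an assumption-laden toy, not the procedure proposed by Li et al.; the peptide strings and the `redact_for_sharing` helper are hypothetical.

```python
# Toy illustration only: peptide strings are invented.
REFERENCE_PEPTIDES = {"LSVEALNSLTGEFK", "DAEAWFNEK", "YEELQITAGR"}


def redact_for_sharing(observed, reference=REFERENCE_PEPTIDES):
    """Keep only peptides found in the reference set, preserving their order.

    Peptides carrying a person-specific sequence variant fall outside the
    reference and are therefore removed from the shared list.
    """
    return [peptide for peptide in observed if peptide in reference]


raw_sample = ["DAEAWFNEQ", "LSVEALNSLTGEFK", "YEELQVTAGR", "YEELQITAGR"]
print(redact_for_sharing(raw_sample))  # prints ['LSVEALNSLTGEFK', 'YEELQITAGR']
```

Filtering of this kind only removes one type of identifying signal, which is consistent with the point that redaction makes re-identification harder rather than impossible.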

A quick search on Google reveals numerous repositories where proteomic data have been shared. When it comes time to publish a scientific paper, some journals, such as the PLOS journals, make sharing proteomic data mandatory (http://journals.plos.org/plosone/s/data-availability). It is worth noting that a few exceptions to making the data available do exist, but a chance of the data holder being hacked still remains. Although some journals merely recommend sharing the data (e.g., Cell, https://www.cell.com/cell/authors), there are growing calls for more data transparency (Matheson, 2018). The government has made a similar proposition (Friedman, 2018). With such a demand for transparency and access to data, can we still hold onto our beloved privacy? And how does this affect people’s willingness to donate biological samples or take part in clinical studies? The only thing that is certain now is that once the data are out, there is no way to secure them again.

 

References

Curran, A. M., Fogarty Draper, C., Scott-Boyer, M. P., Valsesia, A., Roche, H. M., Ryan, M. F., . . . Kaput, J. (2017). Sexual Dimorphism, Age, and Fat Mass Are Key Phenotypic Drivers of Proteomic Signatures. J Proteome Res, 16(11), 4122-4133. doi:10.1021/acs.jproteome.7b00501

Friedman, L. (2018, March 26). The E.P.A. Says It Wants Research Transparency. Scientists See an Attack on Science. New York Times. (Retrieved on June 6, 2018 from https://www.nytimes.com/2018/03/26/climate/epa-scientific-transparency-honest-act.html).

Hensley, S. (2018, June 1). POLL: Genealogical Curiosity Is A Top Reason For DNA Tests; Privacy A Concern. NPR. (Retrieved on June 3, 2018 from https://www.npr.org/sections/health-shots/2018/06/01/616126056/poll-genealogical-curiosity-is-a-top-reason-for-dna-tests-privacy-a-concern).

Li, S., Bandeira, N., Wang, X., & Tang, H. (2016). On the privacy risks of sharing clinical proteomics data. AMIA Jt Summits Transl Sci Proc, 2016, 122-131.

Matheson, S. (2018, May 30). Why you should deposit your raw data. Crosstalk [blog post]. (Retrieved on June 6, 2018 from http://crosstalk.cell.com/blog/why-you-should-deposit-your-raw-data).

Parker, G. J., Leppert, T., Anex, D. S., Hilmer, J. K., Matsunami, N., Baird, L., . . . Leppert, M. (2016). Demonstration of Protein-Based Human Identification Using the Hair Shaft Proteome. PLoS One, 11(9), e0160653. doi:10.1371/journal.pone.0160653

Thielking, M. (2018, June 5). Genealogy site MyHeritage says 92 million user accounts compromised. STAT. (Retrieved on June 6, 2018 from https://www.statnews.com/2018/06/05/genealogy-site-myheritage-says-92-million-user-accounts-compromised/).