“Space vomit” is a truly scientific idiom. At least, it was in the lab. We used this queasy term to describe rendered 3D images of nucleic acids that resembled more the product of an astronaut’s upset stomach than the structure of a molecule. The data were just too noisy and, ultimately, useless for extracting any meaningful insight.

Worse, even when we knew that we had space vomit on our hands, we were still tempted to try to make sense of it. Maybe one more tweak to the algorithm or input file would suddenly transform the mess into something meaningful? Having given in to the temptation too many times, I know first-hand that this kind of salvage moment VERY RARELY happens. And, I think, this is a lesson that many precision medicine disciples could benefit from learning as well.

Okay, what does space vomit have to do with precision medicine? Precision medicine typically conjures the image of using genetic testing to indicate the best medical treatment or lifestyle choices for people (Marcon, Bieber, & Caulfield, 2018). But we have to be honest: in the majority of cases, deducing a medical or lifestyle choice from a genome sequence is akin to trying to make spatial sense of space vomit. It is just too messy.

Indeed, our genomes are incredibly noisy and contain too much “stuff” for us to effectively make sense of them, at least at the moment. Only ~1% of our DNA codes for proteins (the molecules responsible for almost every function in our bodies) (Zhao, 2012). So the genome holds not only the instructions for building each protein, but also, somewhere in the rest, the instructions for when to use the protein and for controlling its activity. In addition to carrying relevant information, the genome can also include DNA from other sources, such as viruses, transposons, and bacteria (Crisp, Boschetti, Perry, Tunnacliffe, & Micklem, 2015; Soucy, Huang, & Gogarten, 2015). And these are just a few of the numerous complicated variations that our genomes carry.

The noise/complexity problem only gets worse beyond the sequence itself. A person’s DNA gets replicated over and over during the course of a lifetime, and the machinery responsible for this task occasionally makes mistakes (Harris & Nielsen, 2014), which may get passed to the next generation. Some of these mistakes, or mutations, may be expected to give rise to some horrible disease, but even having a “bad” mutation is not a guarantee that the bearer will show clinical symptoms of the disease (Chen et al., 2016)!

Despite the uncertainty, many researchers (and even a growing number of companies) are deeply invested in linking genomic mistakes to various traits and medical problems. But trying to associate complex traits with changes in the genetic code is difficult at best. Recently, a call to action has been raised for changing how this is being done (Boyle, Li, & Pritchard, 2017).

Aside from trying to extract insights from the genome, another problem exists: the actual readout of the genetic material. Different entities, such as businesses and research institutions, usually have different protocols for generating data and different algorithms for figuring out the DNA sequence from the data. Now, it may seem that no matter what, the same sequence should be generated. Right? Recently, a JAMA Oncology article described yet another instance of discrepancies when it comes to DNA sequencing (Torga & Pienta, 2017). In this article, researchers sent samples from the same set of patients to two different companies with labs certified under the Clinical Laboratory Improvement Amendments (CLIA), a way of vetting companies that sell diagnostic tests, and got back different sequences for the majority of the patients. So which company provided the “correct” DNA sequence? This is a great question, especially when the choice of medical care made by providers presumably hinges on having a “correct” sequence.

And this brings us back to space vomit: There are instances when knowing the correct DNA sequence can be beneficial in the treatment of cancers, such as melanoma and some types of lung cancer (Harris, 2018). But these occurrences may be the exception rather than the rule, though they might be what causes people to give in to the temptation of continuing to hammer away at noisy, complex, and potentially incorrect data to find the magical answer for medical treatments or lifestyle choices. It is reassuring to know that awareness is growing about the limitations of the DNA sequence for determining better medical intervention (Harris, 2018). Perhaps the precision medicine field will now turn its attention to other, less space vomit-like data types that may more readily lead to better medical or lifestyle choices.


Boyle, E. A., Li, Y. I., & Pritchard, J. K. (2017). An Expanded View of Complex Traits: From Polygenic to Omnigenic. Cell, 169(7), 1177-1186. doi:10.1016/j.cell.2017.05.038

Chen, R., Shi, L., Hakenberg, J., Naughton, B., Sklar, P., Zhang, J., . . . Friend, S. H. (2016). Analysis of 589,306 genomes identifies individuals resilient to severe Mendelian childhood diseases. Nat Biotechnol, 34(5), 531-538. doi:10.1038/nbt.3514

Crisp, A., Boschetti, C., Perry, M., Tunnacliffe, A., & Micklem, G. (2015). Expression of multiple horizontally acquired genes is a hallmark of both vertebrate and invertebrate genomes. Genome Biol, 16, 50. doi:10.1186/s13059-015-0607-3

Harris, K., & Nielsen, R. (2014). Error-prone polymerase activity causes multinucleotide mutations in humans. Genome Res, 24(9), 1445-1454. doi:10.1101/gr.170696.113

Harris, R. (2018, January 15). For Now, Sequencing Cancer Tumors Holds More Promise Than Proof. NPR. Retrieved from https://www.npr.org/sections/health-shots/2018/01/15/572940706/for-now-sequencing-cancer-tumors-holds-more-promise-than-proof

Marcon, A. R., Bieber, M., & Caulfield, T. (2018). Representing a “revolution”: how the popular press has portrayed personalized medicine. Genet Med. doi:10.1038/gim.2017.217

Soucy, S. M., Huang, J., & Gogarten, J. P. (2015). Horizontal gene transfer: building the web of life. Nat Rev Genet, 16(8), 472-482. doi:10.1038/nrg3962

Torga, G., & Pienta, K. J. (2017). Patient-Paired Sample Congruence Between 2 Commercial Liquid Biopsy Tests. JAMA Oncol. doi:10.1001/jamaoncol.2017.4027

Zhao, R. (2012, November 8). ENCODE: Deciphering Function in the Human Genome. Retrieved from https://www.genome.gov/27551473/genome-advance-of-the-month-encode-deciphering-function-in-the-human-genome/