Lung cancer risk in ever smokers


Lung cancer is the second most common cancer type and the leading cause of cancer death globally, with smoking and advancing age as the leading causal risk factors. The USPSTF guidelines for lung cancer screening recommend annual screening for select current or former smokers over 50 years of age. Although annual screening via low-dose CT has been demonstrated to decrease lung cancer mortality, compliance with screening guidelines remains low. Additional prognostic tools for stratifying future lung cancer risk, particularly those that do not rely on immutable demographic and health-history factors, may help increase screening compliance and monitor changing risk over time.


Using the SomaScan® Assay v4.0, a modified-aptamer proteomics technology (Fig 1), we measured ~5,000 proteins in 6,085 EDTA plasma samples from “Ever Smokers” (current or former smokers, aged 50-73) with no known prevalent cancer at visit 3 of the Atherosclerosis Risk in Communities (ARIC) study, for a total of ~30 million protein measurements. A total of 348 incident lung cancer diagnoses occurred in this sample set, 75 of them within 5 years of the visit 3 blood draw. Time to lung cancer diagnosis was modeled from protein measurements using machine learning methods in 70% of ARIC visit 3 ever smokers. A model was selected based on performance in a 15% holdout subset and validated in the remaining 15% of ARIC visit 3 samples, which were not used for model training or selection.
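The 70% / 15% / 15% partition described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the function name `split_70_15_15`, the random seed, and the use of a simple unstratified shuffle are all assumptions, and the abstract does not say whether the split was stratified by event status.

```python
import random

def split_70_15_15(sample_ids, seed=0):
    """Shuffle sample IDs and split them into training (70%),
    model-selection (15%), and validation (remainder, ~15%) subsets.

    Hypothetical helper: a plain random shuffle is an assumption.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    n_train = n * 70 // 100   # integer arithmetic avoids float rounding
    n_select = n * 15 // 100
    train = ids[:n_train]
    select = ids[n_train:n_train + n_select]
    validate = ids[n_train + n_select:]  # everything left over
    return train, select, validate

# 6,085 samples, as reported in the abstract; IDs here are placeholders.
train, select, validate = split_70_15_15(range(6085))
print(len(train), len(select), len(validate))
```

Because only 75 diagnoses fall within the 5-year window, a real analysis might prefer a split that balances events across the three subsets; that choice is left open here.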


A 7-feature protein-only accelerated failure time (AFT) Weibull model was developed to predict the probability of a lung cancer diagnosis within 5 years of blood draw. The model achieved AUCs of 0.76, 0.72, and 0.83 in the training, model selection, and validation datasets, respectively. Based on predicted probabilities from the model, individuals were stratified into 3 risk bins (low, medium, and high), with 5-year event rates of 0.49% in the low-risk bin and 2.74% in the high-risk bin.
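To make the link between a Weibull AFT model and a 5-year probability concrete, the sketch below uses the standard Weibull AFT survival function, S(t) = exp(-(t / scale)^shape) with scale = exp(intercept + sum of coefficient × feature), and reports 1 - S(5). The coefficients, intercept, shape parameter, and bin cutoffs are hypothetical values chosen for illustration; the abstract does not report the fitted parameters or the thresholds used for the risk bins.

```python
import math

def weibull_aft_event_prob(features, coefs, intercept, shape, horizon_years=5.0):
    """Probability of an event within `horizon_years` under a Weibull AFT model.

    scale = exp(intercept + sum(coef * x)); S(t) = exp(-(t / scale) ** shape)
    """
    linear = intercept + sum(c * x for c, x in zip(coefs, features))
    scale = math.exp(linear)
    survival = math.exp(-((horizon_years / scale) ** shape))
    return 1.0 - survival

def risk_bin(prob, low_cut, high_cut):
    """Assign a low / medium / high bin from two probability cutoffs."""
    if prob < low_cut:
        return "low"
    if prob < high_cut:
        return "medium"
    return "high"

# Hypothetical parameters for a 7-feature model (NOT the fitted values).
coefs = [-0.30, 0.25, -0.15, 0.10, -0.20, 0.05, -0.10]
intercept = 4.0
shape = 1.2

# Standardized protein measurements for one individual (placeholder zeros).
x = [0.0] * 7
p = weibull_aft_event_prob(x, coefs, intercept, shape)

# Hypothetical cutoffs; the abstract does not state the ones it used.
print(round(p, 4), risk_bin(p, low_cut=0.01, high_cut=0.03))
```

In an AFT parameterization, a positive coefficient lengthens the expected time to diagnosis and so lowers the 5-year probability, while a negative coefficient raises it.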


We successfully developed a blood-based, protein-only model that predicts the risk of developing lung cancer in ever smokers. The protein model outperforms traditional risk factors for lung cancer and, because it does not rely on immutable factors, has the potential to provide real-time risk estimates that can be reassessed over time. Proteomics-driven risk stratification may help increase adherence to lung cancer screening guidelines and/or encourage positive changes in modifiable risk-related behaviors.

