Lung cancer risk in ever smokers


Lung cancer is the second most common cancer type and the leading cause of cancer death globally, with smoking and advancing age as the leading causal risk factors. The USPSTF guidelines for lung cancer screening recommend annual screening for select current or former smokers over 50 years of age. While annual screening via low-dose CT has been demonstrated to decrease lung cancer mortality, compliance with screening guidelines remains low. Additional prognostic tools for stratifying future lung cancer risk, particularly those that do not rely on immutable demographic and health-history factors, may help increase screening compliance and monitor changing risk over time.


Using the modified-aptamer proteomics technology of the SomaScan® Assay v4.0 (Fig 1), we measured ~5,000 proteins in 6,085 EDTA plasma samples from “Ever Smokers” (current or former smokers, aged 50-73) with no known prevalent cancer at visit 3 of the Atherosclerosis Risk in Communities (ARIC) study, for a total of ~30 million protein measurements. A total of 348 incident lung cancer diagnoses occurred in this sample set, 75 of them within 5 years of the visit 3 blood draw. Time to lung cancer diagnosis was modeled from protein measurements using machine learning methods in 70% of the ARIC visit 3 ever smokers. A model was selected based on performance in a 15% holdout subset and validated in the remaining 15% of ARIC visit 3 samples, which were not used for model training or selection.
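The 70% / 15% / 15% partition described above can be illustrated with a minimal sketch. It assumes a simple seeded random shuffle of sample identifiers; the study's actual partitioning procedure (for example, whether it was stratified by event status) is not described here, and the function name `split_70_15_15` is hypothetical.

```python
import random


def split_70_15_15(sample_ids, seed=0):
    """Shuffle sample IDs and split them into training, model-selection,
    and validation subsets in roughly 70% / 15% / 15% proportions.

    Assumption: a plain random split; the poster does not state how
    samples were actually assigned to subsets.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility

    n_train = round(0.70 * len(ids))
    n_select = round(0.15 * len(ids))

    train = ids[:n_train]
    select = ids[n_train:n_train + n_select]
    validate = ids[n_train + n_select:]  # whatever remains (about 15%)
    return train, select, validate


# Illustrative use with placeholder IDs for the 6,085 plasma samples.
train, select, validate = split_70_15_15(range(6085))
```

Keeping the validation subset untouched until a single model has been chosen on the selection subset is what allows the validation AUC to serve as an unbiased performance estimate.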


A 7-feature, protein-only Weibull accelerated failure time (AFT) model was developed to predict the probability of a lung cancer diagnosis within 5 years of blood draw. The model's AUC was 0.76, 0.72, and 0.83 in the training, model selection, and validation datasets, respectively. Based on predicted probabilities from the model, individuals were stratified into 3 risk bins (low, medium, and high), with 5-year event rates of 0.49% in the low-risk bin vs 2.74% in the high-risk bin, roughly a 5.6-fold difference.
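To show how a Weibull AFT model turns protein measurements into a 5-year probability, here is a minimal sketch using the standard parameterization S(t | x) = exp(-(t / scale(x))^shape) with scale(x) = exp(intercept + coefs · x). The coefficients, intercept, shape, feature values, and bin cutoffs below are hypothetical placeholders, not the fitted values or thresholds from the poster, and time is assumed to be measured in years.

```python
import math


def weibull_aft_event_prob(features, coefs, intercept, shape, horizon_years=5.0):
    """Probability of an event by `horizon_years` under a Weibull AFT model.

    Survival function: S(t | x) = exp(-(t / scale) ** shape),
    where scale = exp(intercept + sum(coef_i * x_i)).
    """
    linear = intercept + sum(b * x for b, x in zip(coefs, features))
    scale = math.exp(linear)
    survival = math.exp(-((horizon_years / scale) ** shape))
    return 1.0 - survival


def risk_bin(prob, low_cut, high_cut):
    """Assign a predicted probability to a low / medium / high risk bin.

    The cutoffs are supplied by the caller; the poster does not report them.
    """
    if prob < low_cut:
        return "low"
    if prob < high_cut:
        return "medium"
    return "high"


# Hypothetical 7-feature example (all numbers are made up for illustration).
coefs = [-0.30, 0.20, -0.15, 0.10, -0.25, 0.05, -0.10]
features = [0.5, -1.0, 0.2, 0.0, 1.1, -0.3, 0.8]  # e.g. standardized protein levels
p5 = weibull_aft_event_prob(features, coefs, intercept=5.0, shape=1.2)
label = risk_bin(p5, low_cut=0.01, high_cut=0.03)
```

In an AFT model, a negative coefficient shortens the expected time to diagnosis as the feature increases, which raises the predicted 5-year probability.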


We developed a blood-based, protein-only model that predicts the risk of developing lung cancer in ever smokers. The protein model outperforms traditional risk factors for lung cancer and, because it does not rely on immutable factors, has the potential to provide real-time risk that can be reassessed over time. Proteomics-driven risk stratification may increase adherence to lung cancer screening guidelines and/or prompt positive changes in modifiable risk-related behaviors.
