Lung Cancer risk in ever smokers


Lung cancer is the second most common cancer type and the leading cause of cancer death globally, with smoking and advancing age as the leading causal risk factors. The USPSTF guidelines for lung cancer screening recommend annual screening for select current or former smokers over 50 years of age. Although annual screening via low-dose CT has been demonstrated to decrease lung cancer mortality, compliance with screening guidelines remains low. Additional prognostic tools for stratifying future lung cancer risk, particularly tools that do not rely on immutable demographic and health-history factors, may help increase screening compliance and monitor changing risk over time.


Using the modified-aptamer proteomics technology of the SomaScan® Assay v4.0 (Fig 1), we measured ~5,000 proteins in 6,085 EDTA plasma samples from “Ever Smokers” (current or former smokers, aged 50-73) with no known prevalent cancer at visit 3 of the Atherosclerosis Risk in Communities (ARIC) study, for a total of ~30 million protein measurements. A total of 348 incident lung cancer diagnoses occurred in this sample set, 75 of them within 5 years of the visit 3 blood draw. Time to lung cancer diagnosis was modeled from protein measurements using machine learning methods in 70% of ARIC visit 3 ever smokers. A model was selected based on performance in a 15% holdout subset and validated in the remaining 15% of ARIC visit 3 samples, which were not used for model training or selection.
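The measurement count and the 70/15/15 partition described above can be sketched as follows. This is a minimal illustration that assumes a simple seeded random shuffle; the poster does not say whether the split was stratified by outcome, and the function name `split_70_15_15`, the seed, and the exact rounding of subset sizes are hypothetical.

```python
import random

N_SAMPLES = 6085           # ever-smoker plasma samples (from the abstract)
N_PROTEINS_APPROX = 5000   # approximate proteins measured per sample

def split_70_15_15(sample_ids, seed=0):
    """Shuffle IDs and partition into training / selection / validation sets.

    Hypothetical helper: integer arithmetic keeps subset sizes deterministic,
    and any remainder from rounding falls into the validation set.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * 70 // 100
    n_select = len(ids) * 15 // 100
    train = ids[:n_train]
    select = ids[n_train:n_train + n_select]
    validate = ids[n_train + n_select:]
    return train, select, validate

train, select, validate = split_70_15_15(range(N_SAMPLES))

# Roughly 30 million protein measurements in total.
total_measurements = N_SAMPLES * N_PROTEINS_APPROX
print(len(train), len(select), len(validate), total_measurements)
```

With only 75 events inside the 5-year window, an outcome-stratified split would be a reasonable alternative so that each subset contains some events; that choice is an assumption here, not something the poster reports.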


A 7-feature, protein-only accelerated failure time (AFT) Weibull model was developed to predict the probability of a lung cancer diagnosis within 5 years of blood draw. The AUC was 0.76, 0.72, and 0.83 in the training, model selection, and validation datasets, respectively. Based on the model's predicted probabilities, individuals were stratified into 3 risk bins (low, medium, and high), with a 5-year event rate of 0.49% in the low-risk bin vs 2.74% in the high-risk bin.
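For readers unfamiliar with Weibull AFT models, the sketch below shows how a 5-year event probability can be computed from such a model and mapped to risk bins. It uses the standard Weibull AFT parameterization, in which the scale is exp(intercept + coefficients · features) and the survival function is S(t) = exp(-(t / scale)^shape). All coefficients, the shape parameter, and the bin cutoffs are hypothetical placeholders, not values from the poster's model.

```python
import math

def weibull_aft_event_probability(features, coefs, intercept, shape, t_years=5.0):
    """Return P(event by t) = 1 - exp(-(t / scale)^shape).

    scale = exp(intercept + sum(coef * feature)), the standard Weibull AFT form.
    """
    linear_predictor = intercept + sum(c * x for c, x in zip(coefs, features))
    scale = math.exp(linear_predictor)
    return 1.0 - math.exp(-((t_years / scale) ** shape))

def risk_bin(probability, low_cut=0.01, high_cut=0.02):
    """Assign a risk bin using hypothetical probability cutoffs."""
    if probability < low_cut:
        return "low"
    if probability < high_cut:
        return "medium"
    return "high"

# Hypothetical parameters for a 7-feature model (NOT the poster's fitted values).
coefs = [0.10, -0.20, 0.05, 0.15, -0.10, 0.08, -0.12]
intercept = 4.0
shape = 1.5
features = [0.0] * 7  # assumed standardized protein measurements at their mean

p5 = weibull_aft_event_probability(features, coefs, intercept, shape)
print(f"5-year probability: {p5:.4f} -> {risk_bin(p5)}")
```

In this parameterization a larger linear predictor lengthens the expected time to diagnosis, so it lowers the 5-year probability.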


We developed a blood-based, protein-only model that predicts the risk of developing lung cancer in ever smokers. The protein model outperforms traditional risk factors for lung cancer and, because it does not rely on immutable factors, has the potential to provide real-time risk estimates that can be reassessed over time. Proteomics-driven risk stratification may increase adherence to lung cancer screening guidelines and/or encourage positive change in modifiable risk-related behaviors.
