Development and validation of a blood-based protein predictor of 20-year dementia risk from middle age


In the US, the number of individuals affected by dementia is expected to double by 2040. Tools that identify at-risk individuals earlier in disease progression, or before disease onset, are therefore vital. Proteomics-driven risk stratification for dementia may enable the initiation of life planning and/or prompt changes in modifiable lifestyle-related risk factors. Moreover, test results may facilitate enrichment of clinical trial enrollment and, ultimately, aid precision prescribing of preventative therapeutics. We therefore sought to develop and validate a blood-based, protein-only predictor of 20-year dementia risk from middle age.


Using the modified-aptamer proteomics technology SomaScan® Assay v4.0, we measured ~5,000 proteins in 11,277 EDTA plasma samples from individuals aged 49-73 without known dementia at the visit 3 (V3) blood draw of the Atherosclerosis Risk in Communities (ARIC) study, totaling ~56 million protein measurements. A total of 1,305 incident dementia diagnoses occurred within 20 years of blood draw. Time to dementia diagnosis was modeled from protein measurements using machine learning methods, with 70% of ARIC V3 samples as the training dataset. A model was selected based on performance in a 15% holdout subset and validated in the remaining 15% of samples. In a subset of 3,852 individuals with known ApoE status, the performance of the proteomic model was compared with the predictive performance of ApoE risk genotype.
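The 70/15/15 partition described above can be sketched as follows. This is a minimal illustration rather than the study's actual procedure: the function name `split_70_15_15`, the fixed seed, and the use of simple random shuffling are assumptions, since the abstract does not say how samples were assigned or whether the split was stratified.

```python
import random


def split_70_15_15(sample_ids, seed=0):
    """Randomly partition IDs into training (70%), model-selection (15%),
    and validation (the remainder, about 15%) subsets.

    Hypothetical helper: simple shuffling with a fixed seed is an assumption.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    n_train = int(0.70 * len(ids))
    n_select = int(0.15 * len(ids))
    train = ids[:n_train]
    select = ids[n_train:n_train + n_select]
    validate = ids[n_train + n_select:]  # whatever is left over
    return train, select, validate


# Placeholder integer IDs standing in for the 11,277 plasma samples.
train, select, validate = split_70_15_15(range(11277))
print(len(train), len(select), len(validate))
```

Keeping the model-selection subset separate from the validation subset means the final reported performance comes from samples that played no role in fitting or choosing the model.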


A 25-feature, protein-only accelerated-failure-time Weibull model was developed to predict the probability of a dementia diagnosis within 20 years of blood draw. Model performance, measured by AUC, was 0.73, 0.73, and 0.70 in the training, model selection, and validation datasets, respectively. Based on predicted probabilities from the model, individuals were stratified into low, medium-low, medium-high, and high risk bins, with a 20-year event rate of 3.79% vs 35.58% in the low vs high risk bins. Model performance was significantly superior to that of ApoE genotype.
Conclusion: We developed a well-calibrated protein-only model with a 10-fold dynamic range that predicts 20-year risk of a dementia diagnosis. The protein model outperforms leading genetic risk factors for dementia and, because it excludes immutable factors, has the potential to monitor changing risk.
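To make the link between a Weibull accelerated-failure-time model and a 20-year probability concrete, the sketch below uses the standard Weibull form, where the event probability by time t is 1 − exp(−(t/λ)^k) and the scale λ is the exponential of a linear predictor. Everything numeric here is hypothetical: the coefficients, intercept, shape parameter, and bin cutoffs are placeholders, not the fitted 25-protein model or the study's actual thresholds, which the abstract does not report.

```python
import math


def weibull_aft_event_probability(features, coefficients, intercept,
                                  shape, horizon_years=20.0):
    """Probability of an event by `horizon_years` under a Weibull AFT model.

    scale = exp(intercept + sum(coef * feature)); survival at time t is
    exp(-(t / scale) ** shape). All parameter values are assumptions.
    """
    linear_predictor = intercept + sum(
        b * x for b, x in zip(coefficients, features)
    )
    scale = math.exp(linear_predictor)  # characteristic time in years
    survival = math.exp(-((horizon_years / scale) ** shape))
    return 1.0 - survival


def risk_bin(probability, cutoffs=(0.05, 0.10, 0.20)):
    """Map a predicted probability to one of four bins (hypothetical cutoffs)."""
    labels = ("low", "medium-low", "medium-high", "high")
    for cutoff, label in zip(cutoffs, labels):
        if probability < cutoff:
            return label
    return labels[-1]


# Two made-up standardized protein features with made-up coefficients.
p = weibull_aft_event_probability(
    features=[0.5, -1.2], coefficients=[-0.3, 0.2], intercept=4.0, shape=1.5
)
print(round(p, 3), risk_bin(p))
```

In this parameterization, a feature with a negative coefficient shortens the expected time to diagnosis and so raises the 20-year probability.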


Clare Paterson, PhD,
Amy Zhang, PhD,
Rachel Ostroff, PhD,
Yolanda Hagar, PhD,
Kelsey Loupy, PhD,
Stephen Williams, MD, PhD
SomaLogic, Boulder, CO, USA
