Development and validation of a blood-based protein predictor of 20-year dementia risk from middle age


Background: In the US, the number of individuals affected by dementia is expected to double by 2040. Tools that identify at-risk individuals earlier in disease progression, or before disease onset, are therefore vital. Proteomics-driven risk stratification for dementia may enable the initiation of life-planning and/or elicit changes in modifiable lifestyle-related risk factors. Moreover, test results may facilitate enrichment of clinical trial enrollment and ultimately serve as an aid in precision prescribing of preventative therapeutics. We therefore sought to develop and validate a blood-based, protein-only predictor of 20-year dementia risk from middle age.


Methods: Using modified-aptamer proteomics technology (SomaScan® Assay v4.0), we scanned ~5,000 proteins in 11,277 EDTA plasma samples from individuals aged 49-73 without known dementia at the visit 3 blood draw of the Atherosclerosis Risk in Communities (ARIC) study, totaling ~56 million protein measurements. A total of 1,305 incident dementia diagnoses occurred within 20 years of blood draw. Time to dementia diagnosis was modeled from protein measurements using machine learning methods, with 70% of the ARIC visit 3 samples as the training dataset. A model was selected based on performance in a 15% holdout subset and validated in the remaining 15% of samples. In a subset of 3,852 individuals with known ApoE status, the predictive performance of the proteomic model was compared with that of ApoE risk genotype.
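The 70/15/15 partition described above can be sketched as follows. This is a minimal illustration only: the abstract does not state how samples were assigned to each subset, so the simple random shuffle, the seed, and the function name `split_indices` are all assumptions.

```python
import random

def split_indices(n_samples, train_frac=0.70, select_frac=0.15, seed=0):
    """Randomly partition sample indices into training, model-selection,
    and validation subsets (hypothetical helper, not the study's code)."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)  # assumed: plain random assignment
    n_train = round(train_frac * n_samples)
    n_select = round(select_frac * n_samples)
    train = order[:n_train]
    select = order[n_train:n_train + n_select]
    validate = order[n_train + n_select:]  # whatever remains (~15%)
    return train, select, validate

# 11,277 is the number of plasma samples reported in the abstract.
train_idx, select_idx, validate_idx = split_indices(11_277)
```

Holding back the validation subset until a single model has been chosen on the selection subset is what allows the validation AUC to be read as an estimate of out-of-sample performance.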


Results: A 25-feature, protein-only Weibull accelerated-failure-time model was developed to predict the probability of a dementia diagnosis within 20 years of blood draw. Model performance, measured by AUC, was 0.73, 0.73, and 0.70 in the training, model-selection, and validation datasets, respectively. Based on model-predicted probabilities, individuals were stratified into low, medium-low, medium-high, and high risk bins, with 20-year event rates of 3.79% and 35.58% in the low and high risk bins, respectively. Model performance was significantly superior to that of ApoE genotype.
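To illustrate how a Weibull accelerated-failure-time model turns protein measurements into a 20-year probability and then a risk bin, here is a minimal sketch using the standard Weibull AFT form, in which the scale is the exponential of a linear predictor and survival is S(t) = exp(-(t/scale)^shape). The coefficients, intercept, shape, and bin cutoffs are hypothetical placeholders; the abstract does not report the fitted values or the cutoffs used.

```python
import bisect
import math

def event_probability(features, coefficients, intercept, shape, horizon_years=20.0):
    """P(diagnosis within horizon) under a Weibull AFT model.
    scale = exp(intercept + sum(coef * feature)); S(t) = exp(-(t / scale) ** shape)."""
    linear = intercept + sum(c * x for c, x in zip(coefficients, features))
    scale = math.exp(linear)
    survival = math.exp(-((horizon_years / scale) ** shape))
    return 1.0 - survival

RISK_LABELS = ("low", "medium-low", "medium-high", "high")

def risk_bin(probability, cutoffs=(0.05, 0.10, 0.20)):
    """Map a predicted probability to one of four bins.
    The cutoffs are assumed values chosen for illustration only."""
    return RISK_LABELS[bisect.bisect_right(cutoffs, probability)]

# Hypothetical two-protein example (the real model uses 25 features).
p = event_probability(features=[0.4, -1.2], coefficients=[-0.3, 0.25],
                      intercept=4.5, shape=1.5)
label = risk_bin(p)
```

In this parameterization a larger linear predictor stretches the time scale, so it lowers the probability of a diagnosis within the 20-year horizon.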
Conclusion: We developed a well-calibrated, protein-only model with an approximately 10-fold dynamic range between the low and high risk bins that predicts 20-year risk of a dementia diagnosis. The protein model outperforms leading genetic risk factors for dementia and, because it excludes immutable factors, has the potential to monitor changing risk.
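The dynamic range quoted in the conclusion can be checked against the figures reported in the abstract. The short calculation below takes "dynamic range" to mean the ratio of the high-bin to the low-bin 20-year event rate, which is an assumed reading; the variable names are illustrative.

```python
# Figures taken from the abstract.
low_bin_rate = 3.79      # % with a dementia diagnosis within 20 years, low risk bin
high_bin_rate = 35.58    # % with a dementia diagnosis within 20 years, high risk bin
n_events = 1_305         # incident dementia diagnoses within 20 years
n_samples = 11_277       # plasma samples analyzed

# Assumed definition: fold difference between the extreme risk bins.
dynamic_range = high_bin_rate / low_bin_rate

# Overall 20-year event rate across all samples, for comparison with the bins.
overall_rate = 100.0 * n_events / n_samples

print(f"high/low fold difference: {dynamic_range:.1f}")
print(f"overall event rate: {overall_rate:.2f}%")
```

The fold difference comes out slightly above 9, and the overall event rate falls between the low-bin and high-bin rates, as expected for a stratification that separates risk.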


Clare Paterson, PhD,
Amy Zhang, PhD,
Rachel Ostroff, PhD,
Yolanda Hagar, PhD,
Kelsey Loupy, PhD,
Stephen Williams, MD, PhD
SomaLogic, Boulder, CO, USA
