Efficient development of prognostic tests for detecting cancer risk using proteomic technology
Background
Prognostic models for assessing future health outcomes can be developed using time-to-event (also known as “survival”) data. This methodology is ubiquitous in the statistical literature and in the analysis of cancer outcomes, but its use in high-dimensional analyses tends to be limited because the methods are difficult to implement in a machine learning environment. Additionally, developing certified prognostic clinical tests that use proteomic biomarkers to detect future cancer risk can be time-consuming, prone to overfitting, and difficult to navigate. We demonstrate the utility of combining SomaScan® proteomic data with machine learning pipeline tools and survival analysis methodology to identify powerful and robust prognostic tests for assessing future risk of cancer that can be certified as laboratory developed tests (LDTs).
Methods
The data pipeline and analysis tools were developed in R. In addition to standard machine learning techniques, the statistical methods include elastic net accelerated failure time (AFT) models, subsampling techniques for survival data, and metrics for assessing predictive survival models. The pipeline takes the analyst from data processing and quality control (QC), through identification of optimal models for predicting clinical endpoints, to validation on a hold-out test set.
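As an illustration of the kind of metric and subsampling step described above, the following is a minimal sketch, not the authors' implementation. It assumes Harrell's concordance index as the performance metric (the abstract does not name the metrics used) and is written in Python for illustration only, although the pipeline itself was built in R. The function names `concordance_index` and `subsampled_c_index`, and the default parameters, are hypothetical.

```python
import random
import statistics


def concordance_index(times, events, risk_scores):
    """Harrell's C: among comparable pairs, the share in which the subject
    with the earlier observed event has the higher predicted risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            # Comparable: subject i had an event strictly before
            # subject j's event or censoring time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def subsampled_c_index(times, events, risk_scores,
                       n_rounds=200, fraction=0.8, seed=0):
    """Recompute the C-index on random subsamples (without replacement)
    and return its mean and standard deviation across rounds."""
    rng = random.Random(seed)
    n = len(times)
    k = max(2, int(fraction * n))
    values = []
    for _ in range(n_rounds):
        idx = rng.sample(range(n), k)
        try:
            values.append(concordance_index(
                [times[i] for i in idx],
                [events[i] for i in idx],
                [risk_scores[i] for i in idx],
            ))
        except ValueError:
            continue  # skip subsamples with no comparable pairs
    if len(values) < 2:
        raise ValueError("too few usable subsamples")
    return statistics.mean(values), statistics.stdev(values)


# Toy data (made up for illustration): follow-up times, event indicators
# (1 = event observed, 0 = censored), and model-predicted risk scores.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.5, 0.1]

c_full = concordance_index(toy_times, toy_events, toy_risk)
c_mean, c_sd = subsampled_c_index(toy_times, toy_events, toy_risk)
```

The spread of the metric across subsamples gives a rough view of how stable a candidate model's performance is, which is one way prediction variability could be tracked during model selection.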
Results
Analysis time from identification of the optimal proteomic model through validation was reduced by at least 80%, and prediction variability decreased by up to 90%. This tool has led to seven LDT-certified SomaLogic prognostic tests (using survival methodology) in the last 3 years.
Conclusions
Not only are powerful, proteomics-driven diagnostic tests realizable, but they can also be LDT-certified in an efficient, reproducible manner and made robust to real-life variability. Efficient analysis tools allow us to leverage proteomic technology in new ways, leading to tests that can be used for precision medicine applications.
Authors
Yolanda Hagar
Leigh Alexander
Jessica Chadwick
Gargi Datta
Joe Gogain
Rachel Ostroff
Clare Paterson
Laura Sampson
Caleb Scheidel
Sama Shrestha
Chi Zhang
Michael Hinterberg
SomaLogic Operating Co., Inc., Boulder, CO, USA