Efficient development of certified diagnostic laboratory developed tests using proteomic data

Background

Proteomic technology is a powerful biological tool with established methods for identifying proteomic biomarkers, but developing certified diagnostic clinical tests from those biomarkers can be time-consuming, prone to overfitting, and difficult to navigate. We demonstrate the utility of combining pipeline tools, statistical learning techniques, and a knowledge base of in-silico proteomic datasets into a reproducible workflow that allows for efficient development of tests certifiable as laboratory developed tests (LDTs) using SomaScan® technology.

Methods

Data pipeline and analysis tools were developed in R and applied to proteomic measurements obtained using the SomaScan Platform. The tools take the analyst from data processing and QC, through identification of optimized models for predicting clinical endpoints, to validation on a hold-out test set. They also assess model robustness against sample handling issues, longitudinal stability, the impact of assay noise on model performance, the effects of putative interferents, and the risk of failure during CLIA validation in the lab. Real-life examples of clinical applications demonstrate the effectiveness of the tools in reducing analysis time and increasing model accuracy.
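One of the robustness checks described above, the impact of assay noise on model performance, can be illustrated with a small simulation. The sketch below is not the authors' implementation (their tools are written in R); it is a minimal Python illustration under stated assumptions: a hypothetical linear model with made-up weights, made-up measurement values, and multiplicative Gaussian noise defined by an assumed coefficient of variation (CV). It perturbs each sample repeatedly and reports how much the predictions vary.

```python
import random
import statistics


def predict(weights, intercept, sample):
    """Prediction from a hypothetical linear model (assumed for illustration)."""
    return intercept + sum(w * x for w, x in zip(weights, sample))


def noise_sensitivity(weights, intercept, samples, cv, n_reps=200, seed=0):
    """Mean per-sample standard deviation of predictions under simulated noise.

    Each measurement is multiplied by (1 + e), with e drawn from a Gaussian
    with mean 0 and standard deviation `cv`. This noise model is an
    assumption, not a description of any real assay.
    """
    rng = random.Random(seed)
    per_sample_sd = []
    for sample in samples:
        preds = []
        for _ in range(n_reps):
            noisy = [x * (1.0 + rng.gauss(0.0, cv)) for x in sample]
            preds.append(predict(weights, intercept, noisy))
        per_sample_sd.append(statistics.stdev(preds))
    return statistics.mean(per_sample_sd)


# Hypothetical model and measurements, for illustration only.
weights = [0.5, -0.2, 0.1]
intercept = 1.0
samples = [[10.0, 20.0, 5.0], [12.0, 18.0, 7.0], [9.0, 25.0, 4.0]]

for cv in (0.0, 0.05, 0.10):
    sd = noise_sensitivity(weights, intercept, samples, cv)
    print(f"assumed CV={cv:.2f}  mean prediction SD={sd:.4f}")
```

A candidate model whose predictions spread widely at a realistic noise level would be a natural candidate for retuning before validation; the threshold for "too wide" depends on the clinical application and is not specified in the abstract.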

Results

Analysis time, from identifying the optimal proteomic model through validation, was reduced by at least 80%, and prediction variability decreased by up to 90%. In at least 75% of cases, applying in-silico data allowed predictive models to be tuned for robustness in a variety of everyday settings. These tools have led to 16 LDT-certified SomaLogic tests in the last 3 years, ranging from anthropometric measurements to cardiovascular- and cancer-risk predictions.

Conclusions

Not only are powerful, proteomics-driven, diagnostic tests realizable, but they can be LDT certified in an efficient, reproducible manner and made to be robust to real-life variability. Efficient analysis tools allow us to leverage proteomic technology in new ways, leading to tests that can be used for precision medicine applications.

Authors

Y. Hagar
L.E. Alexander
C. Scheidel
A. Zhang
J. Gogain
C. Paterson
R. Ostroff
M.A. Hinterberg

SomaLogic Operating Co., Inc., Boulder, CO, USA
