Proteomic models to predict pre-analytical variation
Background
In biomarker discovery, it is critical to assess any pre-analytical variation (PAV) in order to avoid artificial bias in the intended measurements. PAV may arise from both avoidable and unavoidable factors, resulting in misleading data and incorrect conclusions. Proteins, in particular, are vulnerable to variation in collection methods, storage temperatures, and processing protocols. It is vitally important to understand this PAV when analyzing samples using protein assays.
Methods
Human EDTA plasma and serum samples were processed using standardized methods with deliberate excursions from ideal collection conditions, then assayed on the SomaScan® Platform, which measures ~7,000 analytes. Using machine-learning methods, these quantitative protein measurements were compared to sample processing truth standards (e.g., time-to-spin) to create predictive models. These models, termed SomaSignal® tests (SSTs), were developed to enable the assessment of PAV related to processing methods.
Results
SomaSignal tests (SSTs) have been developed to predict time-to-spin, time-to-decant, and time-to-freeze, each reported in hours, for both plasma and serum. Models that predict the number of freeze-thaw cycles a sample has been subjected to have also been developed. All eight models had Lin’s concordance correlation coefficient (CCC) and R² values greater than 0.90 in hold-out validation datasets. In addition to these sample handling predictions, effect sizes for all ~7,000 measurements have been calculated at multiple time points, or freeze-thaw cycles, for each model.
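For readers unfamiliar with the agreement metrics reported above, the following is a minimal sketch of how Lin’s CCC and R² could be computed for a hold-out set. It is not the authors’ code: the function names and example values are hypothetical, and R² is assumed here to be the coefficient of determination (the abstract does not say whether it is that or the squared Pearson correlation).

```python
from typing import Sequence


def lins_ccc(truth: Sequence[float], pred: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(truth)
    if n != len(pred) or n < 2:
        raise ValueError("inputs must have equal length >= 2")
    mean_t = sum(truth) / n
    mean_p = sum(pred) / n
    var_t = sum((t - mean_t) ** 2 for t in truth) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(truth, pred)) / n
    # Penalizes both scatter (via cov) and systematic offset (via mean shift).
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def r_squared(truth: Sequence[float], pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_t = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, pred))
    ss_tot = sum((t - mean_t) ** 2 for t in truth)
    return 1 - ss_res / ss_tot


# Hypothetical hold-out data: true vs. predicted time-to-spin in hours.
true_hours = [0.5, 1.0, 2.0, 4.0, 8.0, 24.0]
pred_hours = [0.7, 1.1, 1.8, 4.5, 7.6, 23.0]
print(round(lins_ccc(true_hours, pred_hours), 3))
print(round(r_squared(true_hours, pred_hours), 3))
```

Unlike a plain correlation, the CCC drops when predictions are consistently shifted from the truth, which is why it is a natural companion to R² when the predicted quantity (hours) must be read on its original scale.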
Conclusions
SomaLogic has developed a unique class of PAV models that are able to assess variation related to processing methods. Results from these predictions can be used during biomarker evaluations to exclude samples with apparent excessive processing delays, identify collection site bias for current and future analyses, and identify sample groupings that may impact analysis. Further, knowing the effect size metrics for all measurements could also enable the removal of specific analytes from modeling and/or be used as covariates in model development.
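As an illustration of the first two uses named above, the sketch below splits samples on a predicted time-to-spin cutoff and summarizes predicted delay by collection site. The field names, the 4-hour cutoff, and the sample records are all hypothetical; the abstract does not specify thresholds or a data format.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Tuple

Sample = Dict[str, object]


def split_by_delay(samples: List[Sample], max_hours: float) -> Tuple[List[Sample], List[Sample]]:
    """Return (kept, excluded) based on predicted time-to-spin in hours."""
    kept = [s for s in samples if s["pred_time_to_spin_h"] <= max_hours]
    excluded = [s for s in samples if s["pred_time_to_spin_h"] > max_hours]
    return kept, excluded


def median_delay_by_site(samples: List[Sample]) -> Dict[str, float]:
    """Median predicted time-to-spin per collection site."""
    by_site = defaultdict(list)
    for s in samples:
        by_site[s["site"]].append(s["pred_time_to_spin_h"])
    return {site: median(hours) for site, hours in by_site.items()}


# Hypothetical predictions for five samples from two sites.
samples = [
    {"id": "S1", "site": "A", "pred_time_to_spin_h": 0.5},
    {"id": "S2", "site": "A", "pred_time_to_spin_h": 1.0},
    {"id": "S3", "site": "B", "pred_time_to_spin_h": 6.0},
    {"id": "S4", "site": "B", "pred_time_to_spin_h": 9.0},
    {"id": "S5", "site": "B", "pred_time_to_spin_h": 2.0},
]

kept, excluded = split_by_delay(samples, max_hours=4.0)  # cutoff is illustrative
print([s["id"] for s in excluded])
print(median_delay_by_site(samples))
```

A large gap between site medians, as in this made-up data, would be one simple signal of collection site bias worth investigating before downstream modeling.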
Authors
David Astling
Dan Drolet
Joe Gogain
Yolanda Hagar
Laura Sampson
Kaitlin Soucie
Kinsey Trinder
Ira von Carlowitz
Matthew Westacott
SomaLogic Operating Co., Inc., Boulder, CO USA