Short description:
Data screening for metabolomics studies involving kinetics is often difficult. The data sets tend to be huge, so some form of automated pre-screening becomes a must when searching for significant biomarkers. An additional complication arises when the kinetics of the molecule is not known (i.e. the pace at which each molecule is metabolised in the body is unknown), and we do not want to discard potential biomarkers merely because they do not fit a (parametric) kinetic model. In this case, the user normally has to deal with the typical sensitivity-selectivity trade-off: a method that is too sensitive raises too many false alarms (i.e. too many biomarkers are wrongly identified as potentially relevant), whereas a method that is too selective produces too many false negatives (i.e. we tend to miss too many important biomarkers).
In this project we looked for smart methods to escape this trade-off, and we proposed a collection of pre-screening methods. With some care, selectivity and sensitivity can be adjusted to discard the vast majority of the data without too many false negatives. The way the method works depends on the situation [30a, 34a], but at its core it is always based on evaluating the autocorrelation of the time series. The reason is simple: when data are collected from a subject at successive time points, any relevant biomarker should vary smoothly as a function of time. As mentioned above, this is extremely useful when we do not know the kinetics of the intervention. We applied this idea to GC-MS data sampled at different time intervals after a certain dose treatment [30a]. In a later extension we developed a multivariate method able to deal with data that contain multiple doses and different individuals and that are collected at different time points [34a]. The methods were applied in several projects with Unilever.
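The autocorrelation idea can be illustrated with a minimal sketch. This is not the published method of [30a] or [34a]; it only assumes that each candidate biomarker is a row of intensities ordered in time and, for simplicity, that the time points are roughly equally spaced. The function names, the use of the lag-1 autocorrelation, and the threshold of 0.5 are all illustrative assumptions.

```python
import numpy as np


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation of a 1-D time series (0.0 for a constant series)."""
    x = np.asarray(series, dtype=float)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    if denom == 0.0:
        # A constant profile carries no temporal structure to score.
        return 0.0
    return float(np.sum(centered[:-1] * centered[1:]) / denom)


def prescreen(data, threshold=0.5):
    """Score each row of `data` (features x time points) and flag smooth ones.

    `threshold` is a hypothetical tuning knob: raising it makes the screen
    more selective, lowering it makes it more sensitive.
    """
    data = np.asarray(data, dtype=float)
    scores = np.array([lag1_autocorrelation(row) for row in data])
    return scores >= threshold, scores


# Toy example: a smooth trend, an erratic alternating signal, a flat signal.
smooth = np.arange(10.0)
erratic = np.array([1.0, -1.0] * 5)
flat = np.full(10, 3.0)

keep, scores = prescreen(np.vstack([smooth, erratic, flat]))
```

In this toy example only the smooth profile passes the screen; the erratic profile has a negative score and the flat one scores zero. No kinetic model is fitted at any point, which is why such a screen remains usable when the kinetics are unknown.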
Credits:
This project was developed at several institutions and involved several people. See the co-authorship of the presentations for more information.
Sponsors:
Unilever, University of Amsterdam.
Presentations:
None available.
Tags:
- Application domain: Food, Pharma & Health Sciences
- Instrument domain: GC, MS
- Statistics domain: Multivariate exploratory analysis, Signal processing