Bayesian approach for feature detection

Short description:

In other projects, I have discussed the importance of choosing the correct algorithms for peak detection (in one-dimensional and two-dimensional chromatography). The majority of methods developed for chromatography rely on peak detection.

All peak detection methods are threshold based. In other words, the user has to set a threshold that the algorithm later uses to decide whether a peak-like shape is a peak or not. Consequently, the algorithm delivers a binary answer: “there is a peak” or “there is no peak”. This way of thinking becomes a problem in certain application domains, because sometimes we want to consider the probability that a certain feature is a peak instead of just getting a “yes/no” answer. This would allow us to keep all possibilities open, as opposed to discarding too early features (possible peaks) that could be important for our study. In other areas (e.g. peak detection in two-dimensional chromatography) we may be confronted with more than two answers. Instead of the “there is a peak”/“there is no peak” dilemma, we have to assign an unknown (modulated) peak x to one of several peak clusters (A, B, C, etc.). In this latter case, we would ideally want to inspect the probabilities of all these propositions.
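To make the threshold problem concrete, here is a minimal sketch in Python. It is not taken from any of the methods mentioned here: the synthetic signal, the function names (gaussian, detect_peaks) and the two threshold values are assumptions chosen only for illustration. It shows how a small peak is either reported or silently discarded depending on a user-chosen height threshold.

```python
import math

def gaussian(t, center, height, width):
    """Gaussian peak shape (an assumed, idealised peak model)."""
    return height * math.exp(-0.5 * ((t - center) / width) ** 2)

# Synthetic, noise-free chromatogram: one large peak and one small peak.
times = [i * 0.1 for i in range(200)]
signal = [gaussian(t, 5.0, 10.0, 0.3) + gaussian(t, 12.0, 1.5, 0.3) for t in times]

def detect_peaks(signal, threshold):
    """Return indices of local maxima whose height exceeds the threshold."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
        and signal[i] > threshold
    ]

# The same data, two user choices, two different binary answers
# for the small peak near t = 12.
strict = detect_peaks(signal, threshold=2.0)
lenient = detect_peaks(signal, threshold=1.0)
print("threshold 2.0:", [round(times[i], 1) for i in strict])
print("threshold 1.0:", [round(times[i], 1) for i in lenient])
```

With the stricter threshold the small peak is lost without any record that it was ever a candidate, which is exactly the early discarding described above.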

Bayesian statistics offers an elegant solution to the problem above. Instead of delivering a binary answer (in the case of peak detection in one-dimensional chromatography) or only the most likely configuration (in the case of peak detection in two-dimensional chromatography), we are able to calculate the probability of each configuration. By keeping all answers in mind, ranked by their probability, we ensure that we do not discard any of the possible answers too early. The idea of a Bayesian approach to peak detection in two-dimensional chromatography is already public [35a] and has been applied to food analysis and forensics. Other ideas are in progress, and I hope they will be public soon.
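As a rough illustration of ranking propositions by probability, the sketch below applies Bayes' rule to the cluster-assignment case. It is not the published method [35a]: the cluster centres, the Gaussian likelihood, the standard deviation, the equal priors and the names posterior and gaussian_log_pdf are all assumptions made for the example.

```python
import math

def gaussian_log_pdf(x, mean, sd):
    """Log density of a normal distribution (assumed likelihood model)."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - 0.5 * ((x - mean) / sd) ** 2

def posterior(log_likelihoods, priors):
    """Bayes' rule over a finite set of hypotheses, normalised to sum to 1."""
    log_post = {h: math.log(priors[h]) + log_likelihoods[h] for h in priors}
    top = max(log_post.values())  # subtract the maximum for numerical stability
    unnorm = {h: math.exp(v - top) for h, v in log_post.items()}
    total = sum(unnorm.values())
    return {h: u / total for h, u in unnorm.items()}

# Hypothetical cluster centres (e.g. second-dimension retention times).
clusters = {"A": 2.10, "B": 2.40, "C": 3.00}
sd = 0.15   # assumed spread of retention times within a cluster
x = 2.30    # retention time of the unknown (modulated) peak x

priors = {name: 1.0 / len(clusters) for name in clusters}  # no preference a priori
loglik = {name: gaussian_log_pdf(x, centre, sd) for name, centre in clusters.items()}

probs = posterior(loglik, priors)
ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
for name, p in ranked:
    print(f"P(x belongs to {name} | data) = {p:.4f}")
```

Under these assumed numbers, cluster B comes out as the most probable assignment, but A keeps roughly a third of the probability, so it stays on the list instead of being discarded. The same posterior function covers the one-dimensional case by using two hypotheses, "peak" and "no peak", each with its own likelihood.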

Credits:

This project was developed at the University of Amsterdam. Several people were involved; see the co-authors of the presentations for more information.

Sponsors:

University of Amsterdam (other sponsors are joining in later phases of this project)

Presentations:

See my presentation at HPLC-2013 (Amsterdam).

Software:

None available

Tecnometrix

A data-analysis company