Postprocessing after chromatographic peak detection

Open jorainer opened this issue 6 years ago • 4 comments

After chromatographic peak detection I find myself frequently looking through identified peaks to see whether they make sense or are just noise (I guess that sounds familiar to most people). Things I frequently encounter are:

  • signal from an ion that got split into two separate peaks.
  • stretches of neighboring peaks, i.e. noisy signal (we're using HILIC) that results in rather long stretches of peak-like signal. For these, centWave usually detects several peaks every now and then.

What I would propose is to implement a cleanChromPeaks function that could be called after findChromPeaks to clean up messy signal or refine identified peaks. Its signature could be cleanChromPeaks(object = "XCMSnExp", param), with param being a parameter object defining the settings for a specific cleaning algorithm. Examples could be:

  • JoinNeighboringPeaksParam: join chromatographic peaks if their m/z range is overlapping and if they are not more than x seconds apart. To avoid joining isomers, skip peaks for which the intensity at rtmax is less than x % of the peak's apex.
  • CleanBroadPeaksParam: remove chromatographic peaks that are wider than x seconds.

Other implementations could follow. They all should take an XCMSnExp and a parameter object as input and return an XCMSnExp (with cleaned/improved chromatographic peaks).
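To make the two proposed operations concrete, here is a minimal base R sketch that works on a plain peak matrix rather than on an XCMSnExp. The helper names joinNeighboringPeaks and cleanBroadPeaks, the toy values, and the choice to sum the into column on merging are assumptions for illustration only; the isomer safeguard mentioned above (intensity at rtmax relative to the apex) needs the raw data and is left out.

```r
# Toy peak table using the column names of an xcms chromPeaks matrix.
# All values are made up for illustration.
peaks <- cbind(
    mzmin = c(100.00, 100.01, 200.00, 300.00),
    mzmax = c(100.02, 100.03, 200.02, 300.02),
    rtmin = c(10, 21, 50, 100),
    rtmax = c(20, 30, 60, 190),
    into  = c(500, 300, 1000, 800)
)

# Hypothetical helper: drop peaks wider than `maxWidth` seconds.
cleanBroadPeaks <- function(pks, maxWidth) {
    keep <- (pks[, "rtmax"] - pks[, "rtmin"]) <= maxWidth
    pks[keep, , drop = FALSE]
}

# Hypothetical helper: merge peaks whose m/z ranges overlap and whose
# retention time gap is at most `maxGap` seconds.
joinNeighboringPeaks <- function(pks, maxGap) {
    pks <- pks[order(pks[, "rtmin"]), , drop = FALSE]
    merged <- pks[0, , drop = FALSE]
    for (i in seq_len(nrow(pks))) {
        cur <- pks[i, ]
        # Already merged peaks that overlap in m/z and end close enough in rt
        hit <- which(merged[, "mzmin"] <= cur[["mzmax"]] &
                     merged[, "mzmax"] >= cur[["mzmin"]] &
                     cur[["rtmin"]] - merged[, "rtmax"] <= maxGap)
        if (length(hit)) {
            j <- hit[1]
            merged[j, "mzmin"] <- min(merged[j, "mzmin"], cur[["mzmin"]])
            merged[j, "mzmax"] <- max(merged[j, "mzmax"], cur[["mzmax"]])
            merged[j, "rtmax"] <- max(merged[j, "rtmax"], cur[["rtmax"]])
            # Assumption: add the areas; a real implementation would
            # re-integrate the signal over the joined region.
            merged[j, "into"] <- merged[j, "into"] + cur[["into"]]
        } else {
            merged <- rbind(merged, pks[i, , drop = FALSE])
        }
    }
    merged
}

joined  <- joinNeighboringPeaks(peaks, maxGap = 2)
cleaned <- cleanBroadPeaks(joined, maxWidth = 60)
```

With these toy values the first two peaks are joined into one, and the 90 second wide peak is removed afterwards, which is the order of operations suggested for cleaning up noisy HILIC signal.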

I guess other people might also have similar utility functions already implemented (@stanstrup , @michaelwitting ?) that could be added too.

jorainer, Sep 30 '19 06:09

Good idea. Would be good if it was possible to mark them also. Or in some way inspect that it makes sense what it is doing. Wouldn't the joining now be implicitly done by group?

The only code I have is something that runs through all the picked m/z values and tries to guess whether each m/z is a contaminant: an m/z is a possible contaminant if its intensity is higher than X for more than Y min. Then all features within some ppm of a detected contaminant are marked as possible contaminants.
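Roughly, that logic could look like the base R sketch below. It is not the actual code; the function names longestRunMinutes and nearContaminant, the thresholds, and the example m/z values are all assumptions for illustration, and retention times are assumed to be in seconds.

```r
# Hypothetical helper: longest continuous stretch (in minutes) during which
# the intensity of one m/z trace stays above `minInt`. `rt` is in seconds.
longestRunMinutes <- function(rt, intensity, minInt) {
    runs   <- rle(intensity > minInt)
    ends   <- cumsum(runs$lengths)
    starts <- ends - runs$lengths + 1
    dur    <- (rt[ends] - rt[starts])[runs$values]
    if (length(dur)) max(dur) / 60 else 0
}

# Hypothetical helper: TRUE for every feature m/z that lies within `ppm`
# of any m/z flagged as a contaminant.
nearContaminant <- function(featureMz, contaminantMz, ppm) {
    vapply(featureMz, function(mz)
        any(abs(mz - contaminantMz) / contaminantMz * 1e6 <= ppm),
        logical(1))
}

# Made-up traces: a constant background ion and a short, genuine peak.
rt         <- seq(0, 600, by = 10)
background <- rep(1e5, length(rt))
spike      <- rep(0, length(rt))
spike[10:12] <- 1e5

isContaminant <- longestRunMinutes(rt, background, minInt = 1e4) > 5
isPeak        <- longestRunMinutes(rt, spike, minInt = 1e4) > 5

# Flag features close to the contaminant m/z (values are illustrative).
flags <- nearContaminant(c(149.0234, 149.1, 301.14), 149.0233, ppm = 5)
```

Here X corresponds to minInt and Y to the 5 minute cut-off; both would need tuning per method and instrument.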

stanstrup, Sep 30 '19 07:09

> to mark them also.

Should be doable, since we now have the chromPeakData DataFrame, which allows adding arbitrary annotations to a chromatographic peak.
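As a rough illustration of marking instead of removing, the sketch below adds a logical flag column to a per-peak annotation table. A plain data.frame stands in for the chromPeakData DataFrame here, and the column name too_broad and the 60 second cut-off are assumptions, not part of xcms.

```r
# Toy peak matrix (made-up values) and a stand-in for its annotation table,
# with one row per chromatographic peak.
peaks    <- cbind(rtmin = c(10, 50, 100), rtmax = c(20, 60, 190))
peakData <- data.frame(ms_level = rep(1L, nrow(peaks)))

# Mark peaks wider than 60 seconds instead of dropping them, so the
# decision can be inspected before anything is actually removed.
peakData$too_broad <- (peaks[, "rtmax"] - peaks[, "rtmin"]) > 60

# The flagged peaks can then be reviewed, plotted, or filtered later.
flagged <- peaks[peakData$too_broad, , drop = FALSE]
```

Keeping the flag separate from the peak matrix means the cleaning step stays reversible until the user decides to filter on it.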

> joining now be implicitly done by group?

In principle yes, but that does depend on the bw parameter - I'd like to do it before the correspondence.

Contaminant detection sounds like a great addition! This is what a combination of my above proposed methods could also do (first joining neighboring peaks and then removing stuff that is too long). Alternative approaches obviously highly welcome!

jorainer, Sep 30 '19 08:09

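I have long been removing exceptionally wide chromatographic peaks after feature detection but before correspondence. Old XCMS data structure:

    if (filtPeaks) {
        orig <- xset@peaks
        # keep only peaks narrower than three times the maximum peak width
        good <- which((orig[, "rtmax"] - orig[, "rtmin"]) < (3 * maxpw))
        # drop = FALSE keeps the matrix shape even if a single peak remains
        xset@peaks <- orig[good, , drop = FALSE]
    }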

It would be great to make that a base function in XCMS.

cbroeckl, Nov 01 '19 22:11

The functionality is now in the master branch, if someone wants to try it.

jorainer, Nov 15 '19 09:11