xcms
xcms copied to clipboard
Orbitrap shoulder filter
Hi,
It would be nice to have build-in support to filter orbitrap artifacts at some point.
I have written a filter but ATM it needs to be used to convert and re-write the raw files.
https://github.com/stanstrup/chemhelper/blob/master/R/orbi.filter.R
EDIT: code now here: https://gitlab.com/R_packages/chemhelper/blob/master/R/orbi.filter.R
Perhaps MSnbase would now be a better place for this?
@lgatto , what would you think ? That sounds like a filter that can be applied upon reading data in ?
Sure, that sounds a sensible addition. I'll try to look a bit more into is this week. @stanstrup, could you provide a little more background/documentation. Feel also free to submit a PR directly.
Sure. In my package I have the functions orbifilter and xcmsRaw.orbifilter.
The orbifilter function takes a matrix with columns "mz" and "intensity" and returns a matrix with the filterered spectrum.
xcmsRaw.orbifilteris just a wrapper for xcmsRaw objects.
Do you need more documentation than is here?: https://github.com/stanstrup/chemhelper/blob/master/man/orbifilter.Rd
A PR will probably be difficult for me ATM since I haven't looked into the MSnbase data structure yet.
Bagground
The orbitrap shoulder peaks are an artifact of the Fourier transformation. But I have not found a paper that describes the problem in relation to orbitrap so I am not 100 % sure about the origin of the peaks. But my understanding/guess is the following: I believe the problem is the phenomenon know as ringing (http://en.wikipedia.org/wiki/Ringing_artifacts). Other people call them ripples (I think they are discussion the same thing here http://www.shimadzu.com/an/ftir/support/tips/letter15/apodization.html). I have also seen them called simply orbitrap artifacts. It is only visible in orbitrap data for very large peaks as small random mass peaks ~0.01 Da from the main peak. Since these artifacts don't usually appear nicely symmetric as you would think I assume many of them fall under some lowpass threshold during centroiding. But I guess it could also be that they are not symmetric in a complex signal. I don't know.
Have eventually a look at the smooth,Spectrum method in MSnbase. Since smoothing/filtering is done per spectrum you could do it there, ideally implementing the orbifiliter function for Spectrum objects (use their @mz and @intensity data).
OK. I will try to do that then.
Here's how I see it:
- a new
smoothmethod withmethod = "orbifilter* for individual MS1 spectra, that would call theorbifilter` function. - a new
smoothmethod withmethod = "orbifilter* for individualMSnExpobjects, that calls the spectum method. Note that this will be a delayed processing method forOnDiskMSnExp` objects. - update
smooth-methods.Rdto mention the filter. - add
orbifilter.Rdwith all the details.
I created the orbifilter branch (but not pushed anything yet) to explore and test this.
Any news on this? Would be nice to get this filter in. I think I am also seeing evidence of random noise in qtof data where this filter should help. The noise seems to skew the reported mz.
Bump. Was something implemented or can I do something to help this along? We have a big dataset again that suffers from this.
Would be cool if you could help out here @stanstrup !
Go ahead and feel free to ask me if something (implementation-wise) is unclear.
I have been trying to dig into the code and I ended up in the Spectra package where smooth is defined. There, what is calculated is some coefficients that are passed to .peaks_smooth which passes them to MsCoreUtils::smooth where it appears the coefficients are subset and passed to stats::filter...
So I have some questions as I am pretty lost...
- Is
smooththe right place for this as we are trying to remove ms peaks rather than modify their intensity? Alternatively we could requirefilterIntensityto be called afterwards. Or should it be a separate filter method? - I am having a hard time understanding what the coefficients matrix should look like. Probably because I miss some signal processing theory. The documentation for
stats::filterseems to assume this is known... So any idea if I can set coefficients in such a way that I set peaks to 0 or their original intensity? If this doesn't fit in as a filter I simply call my function if the method isorbifilter?
EDIT: Just for reference MzMine has "FTMS Shoulder peaks filter" that deals with this. They "draw" a Lorentzian peak around the main peak and kills everything inside, whereas mine just uses a rectangular box.
sorry for my late reply - busy times. Thinking it over it might indeed be something that should go directly into the Spectra package - from there we could move stuff then also to MSnbase if needed (e.g. for xcms).
And in Spectra there are two places where this could go (and which you could start playing around to see if it works as expected):
filterIntensity: allows to pass a function that performs peak filtering.addProcessingwhich can take any function that works on a peak matrix.
For 1) the function should be something like:
keep_intensity <- function(x, thresh = 20, ...) {
x > 20
}
the function should take an argument x which is a numeric with the intensitites of one spectrum and should return a logical, same length than x with TRUE to keep a peak or FALSE to remove it.
For 2)
the function should take a peak matrix as input and should return a peak matrix with the same dimension. Best to have a look at the peaks-functions.R file in Spectra where all these functions are implemented.
Regarding stats::filter - I also have no idea on that...
Any updates/progress on that @stanstrup ? @mar-garcia would also need this orbitrap filter ;)
Unfortunately no. I got away from it again. The project where we needed it needed to go ahead so we used my old method of loading the files with the old xcmsRaw and re-writing them.
I don't think we can use your option 1. We need both the mz and the intensities.
For option 2: if it has to be same dimensions how do we drop the peaks? Set intensities to 0and then filter 0s (AFAIR there is a function for that. Is the also a function for filtering NA intensities if such thing is allow). But it is not super elegant I guess to require two steps anyway... So I am still not sure about the best way to implement.