xcms
Add function to reconstruct MS2 spectrum
For issue #375 we need a function to reconstruct the MS2 spectrum from the list of `Chromatogram` objects. Can you add that @michaelwitting ?
Can do that. The other way round, can you provide a function to correlate `Chromatograms` that is fast (based on `BiocParallel`?)
My idea would be to take a matrix of correlation values, and everything above the threshold goes into the MS2 spectrum. Which intensities should we use? `into`?
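To make the thresholding idea concrete, here is a minimal Python sketch (not xcms code). It assumes the MS1 and MS2 intensity traces are already sampled on the same retention time points. `pearson` and `select_fragments` are hypothetical helper names, the example traces are made up, and the 0.8 cutoff is only an example value.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a flat trace counts as uncorrelated
    return sxy / sqrt(sxx * syy)


def select_fragments(ms1_trace, ms2_traces, threshold=0.8):
    """Indices of MS2 traces whose correlation with the MS1 trace exceeds the threshold."""
    return [i for i, trace in enumerate(ms2_traces)
            if pearson(ms1_trace, trace) > threshold]


# Made-up traces on a shared retention time grid
ms1 = [0, 10, 50, 100, 50, 10, 0]
ms2 = [
    [0, 5, 25, 50, 25, 5, 0],         # same peak shape, half the intensity
    [100, 50, 10, 0, 10, 50, 100],    # opposite shape
    [0, 22, 98, 210, 104, 18, 0],     # similar shape, higher intensity
]
selected = select_fragments(ms1, ms2)
```

Only the traces at the selected indices would contribute peaks to the reconstructed MS2 spectrum. Since correlation ignores scale, a fragment trace that is more intense than the MS1 trace can still be selected.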
Re correlation of `Chromatograms`: yes, can do that.
Regarding the reconstruction - I didn't yet think of the best way. `into` sounds good, but it might be nice if we could also use a different column instead. Could the function take a `chromPeak` matrix as input and return a `Spectrum2` built from it, with `mz` being the `"mz"` column and the intensity column configurable by a parameter?
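A rough Python sketch of that interface (again not xcms code): the peak table is represented as a list of dicts with the `"mz"`, `"into"` and `"maxo"` columns, and a plain dict stands in for the `Spectrum2` object. The function name `peaks_to_spectrum`, its `intensity` argument and the example values are assumptions for illustration only.

```python
def peaks_to_spectrum(peaks, intensity="into"):
    """Build an m/z-sorted spectrum from peak rows.

    peaks: list of dicts, each with an "mz" entry and the chosen intensity column.
    intensity: name of the column to report as the peak intensity.
    """
    pairs = sorted((row["mz"], row[intensity]) for row in peaks)
    return {
        "mz": [mz for mz, _ in pairs],
        "intensity": [value for _, value in pairs],
    }


# Made-up MS2 chromatographic peaks that passed the correlation filter
peaks = [
    {"mz": 145.05, "into": 5200.0, "maxo": 1500.0},
    {"mz": 85.03, "into": 1300.0, "maxo": 420.0},
]
by_area = peaks_to_spectrum(peaks)                       # uses "into"
by_height = peaks_to_spectrum(peaks, intensity="maxo")   # uses "maxo"
```

Switching the reported intensity is then a matter of passing a different column name, without touching the reconstruction logic.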
That was the plan, so: Yes. Do we have the `chromPeaks` already with MS2 levels and pockets?
I will think about a function... Maybe not today, but by Sunday, when I'm on the plane to Greece.
After aligning and correlation we will have a `chromPeaks` matrix containing only the (MS2) peaks for one MS1 chrom peak that pass the correlation criteria. So it should be straightforward I think.
Question @michaelwitting : do we expect that the MS2 signal for an ion is higher than the corresponding MS1 signal? The plot below shows the MS1 chromatogram in black (thick line) and all MS2 chromatograms for that m/z and retention time in light grey/blue if their correlation is > 0.8.
Should we add an additional criterion to check that, apart from the correlation coefficient, the max intensity of the MS2 is also <= the max intensity of the MS1?
Yes, fragment intensities can be higher than the original one. I would say we only use the correlation for the moment and return a `Spectrum2` object that can then be cleaned by additional means.
OK 👍
One idea is to hijack the `CAMERA` approach: We can check if peak intensities not only correlate along RT but also across samples. If we have higher intensities in one sample for the precursor, the fragments should follow. That could maybe remove some false positives that might co-elute.
@sneumann We would adapt functions from `CAMERA` for that, which would go to `XCMS` and can be reused in `CAMERA`. What do you think?
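As an illustration of the across-sample check (a Python sketch under stated assumptions, not CAMERA or xcms code): for one precursor we assume its intensity in each sample is known, together with the intensity of each candidate fragment in the same samples. `follows_precursor` is a hypothetical helper, and the numbers and the 0.8 threshold are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: constant intensities count as uncorrelated
    return sxy / sqrt(sxx * syy)


def follows_precursor(precursor, fragments, threshold=0.8):
    """Map fragment m/z -> True if its across-sample intensities track the precursor."""
    return {mz: pearson(precursor, values) > threshold
            for mz, values in fragments.items()}


# Made-up intensities of one precursor and two candidate fragments in four samples
precursor = [1000.0, 2000.0, 4000.0, 500.0]
fragments = {
    85.03: [110.0, 190.0, 420.0, 60.0],      # rises and falls with the precursor
    127.04: [300.0, 290.0, 310.0, 305.0],    # roughly constant, possibly co-eluting
}
keep = follows_precursor(precursor, fragments)
```

A fragment that co-elutes but does not follow the precursor across samples would be flagged for removal here, which is the kind of false positive the check is meant to catch.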
No problem recycling CAMERA code. If xcms exports the migrated functions, we can have CAMERA switch to them and check the dependency `xcms >= 3.x.y`. Please open CAMERA issues for each function that shall be migrated.
Yours, Steffen
I see the point @michaelwitting - only that we are processing the data at present separately for each file (to enable parallel processing). What if we add this as a postprocessing step to clean reconstructed spectra? Also because this would require the definition of the features in order to know which chromatographic peak in sample 1 goes along with a chromatographic peak from sample 2.
We can do it post chromatogram alignment and correlation. Would be good to have some "annotation" with the reconstructed spectrum. Maybe a matrix with the correlation values? This matrix could be enriched with other values that might be used for filtering.
Yes, that (annotation for reconstructed spectrum) would/should be doable.
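One possible shape for such an annotation, sketched in Python rather than in xcms terms: the spectrum carries one annotation record per peak, and a filter subsets peaks and annotation together. The dict layout, the `"cor"` field name, `filter_spectrum` and the values are all assumptions for illustration.

```python
def filter_spectrum(spectrum, min_cor):
    """Keep only peaks whose annotated correlation is at least min_cor."""
    keep = [i for i, ann in enumerate(spectrum["annotation"])
            if ann["cor"] >= min_cor]
    return {
        "mz": [spectrum["mz"][i] for i in keep],
        "intensity": [spectrum["intensity"][i] for i in keep],
        "annotation": [spectrum["annotation"][i] for i in keep],
    }


# Made-up reconstructed spectrum with per-peak correlation values
spectrum = {
    "mz": [85.03, 127.04, 145.05],
    "intensity": [420.0, 310.0, 1500.0],
    "annotation": [{"cor": 0.97}, {"cor": 0.81}, {"cor": 0.92}],
}
strict = filter_spectrum(spectrum, min_cor=0.9)
```

Further per-peak values could be added to the annotation records later and used for filtering in the same way.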
If possible I would like to keep the definition of the MS2 spectrum separate from the quality assessment of the reconstructed spectra. Also, because we could use/reuse this logic also for other settings. So, ideally, I would like to have:
- `reconstructChromPeakSpectra`: reconstructs MS2 spectra for each chromatographic peak in a file. Does not need features to be defined and can be called directly after chromatographic peak detection.
- function that takes a `Spectra` (with some additional information) and runs quality checks (like correlation of peak across samples). We might even want to re-use this tool then to define the representative MS2 spectrum for a feature for the GNPS stuff, i.e. select the MS2 spectrum for which the intensities of the peaks best follow the intensity of the precursor (chrom peak).
- `reconstructFeatureSpectra`: this could use the function above to clean/purge the reconstructed MS2 spectra for each chromatographic peak associated with the spectrum and return a single higher confidence MS2 spectrum. This function can only be called if features are defined.
What do you think @michaelwitting ? Another benefit is that the code will stay cleaner/simpler to maintain I believe.
Sounds good! Let's keep it separate!
So, for each MS1 chrom peak we can get the MS2 chrom peaks with a correlation higher than a certain threshold. From that we can reconstruct the MS2 Spectrum.
We can use the `"mz"` values of the chrom peaks as m/z values of the spectrum, but should we use the `"maxo"` or the `"into"` for the intensities? I'd go for the `"maxo"`, what do you think @michaelwitting @sneumann ?
I would also go for `maxo`, but maybe we should keep the freedom to define what we would like to use by having a parameter in the function?
Summarizing, the `reconstructChromPeakSpectra` is somewhat similar to MS-DIAL's `MS2Dec` function. The `reconstructFeatureSpectra` could in addition support a reconstruction similar to MS-DIAL's `CorrDec` function, since we then also have the intensities across samples.