isotope pattern for MS/MS precursor
There are several programs which will take a paired isotope pattern 'spectrum' with each MS/MS spectrum. The idea behind this is that the isotope pattern can be of great value in determining molecular formula when your MS/MS data are not found using traditional spectral searching. Sirius and MSFinder are the two big programs that come to mind for this.
Each of them has its own custom export format: .ms for Sirius and .mat for MSFinder. There are probably other programs which have their own custom format. It would be great to have an export format for these popular programs. It seems to me that one of the only limitations preventing something from being written from an R 'Spectra' object is that the 'Spectra' object does not, as far as I know, have any place for MS1 isotope pattern data.
Would it be possible to define such a data structure within the 'Spectra' object? I can certainly hack something together (I am doing so already to support some RAMClustR developments), but given the formality and availability of tools in 'Spectra' format, it would be great to be able to store these isotope data in the Spectra object itself in a predictable format, so those data can be made available to any properly mapped export format.
Hi Corey, thanks for the comment!
Some thoughts from my side: what about storing/keeping the isotope pattern as its own Spectra object? You essentially have the m/z and probability values, which could be stored as the Spectra's m/z and intensity values. So you would have two Spectra of the same length, one with the MS/MS spectra and one with the related MS1 isotope spectra. That would keep the code and data structures simpler and (IMHO) more user friendly.
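A minimal sketch of that two-object layout, assuming made-up peak values and a hypothetical `feature_id` column (not a standard spectra variable) to keep the two objects aligned:

```r
library(Spectra)    # Bioconductor
library(S4Vectors)  # provides DataFrame()

# MS/MS spectra; all peak values are invented for illustration.
ms2_df <- DataFrame(
    msLevel = c(2L, 2L),
    precursorMz = c(181.0707, 365.1054),
    feature_id = c("FT001", "FT002")  # hypothetical key column
)
ms2_df$mz <- list(c(89.02, 119.03, 163.06), c(185.04, 203.05, 365.11))
ms2_df$intensity <- list(c(40, 100, 15), c(30, 100, 60))
ms2 <- Spectra(ms2_df)

# Matching MS1 isotope patterns, stored as ordinary m/z-intensity pairs.
iso_df <- DataFrame(
    msLevel = c(1L, 1L),
    feature_id = c("FT001", "FT002")
)
iso_df$mz <- list(c(181.0707, 182.0741, 183.0750),
                  c(365.1054, 366.1089, 367.1104))
iso_df$intensity <- list(c(100, 6.9, 1.4), c(100, 13.7, 3.1))
iso <- Spectra(iso_df)

# Element i of `iso` belongs to element i of `ms2`.
stopifnot(length(ms2) == length(iso),
          identical(ms2$feature_id, iso$feature_id))
```

Keeping a shared key in both objects makes it possible to re-align them after one of the two has been subset or reordered.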
That's also part of our current workflows: for each chrom peak (after xcms preprocessing), get the MS2 spectra as well as the chromatographic peak's MS1 spectrum (using chromPeakSpectra() with method = "closest_rt" to get the MS1 spectrum closest to the peak apex). We use this MS1 spectrum to extract the potential/tentative isotope peaks using the MetaboCoreUtils::isotopologues() function, with the chrom peak's m/z (i.e. the MS2 spectra's precursor m/z) as seedMz. We thus get an isotope spectrum for each chrom peak, which we can then use to calculate the similarity to the theoretical isotope pattern (given the chemical formula of the compound).
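The isotope extraction step could look roughly like the sketch below. It runs on a toy peak matrix with invented values rather than on real xcms output; the `xdata` object in the comment is hypothetical, and the `ppm` setting is only an example.

```r
library(MetaboCoreUtils)

# In a real workflow the MS1 spectrum would come from something like
#   chromPeakSpectra(xdata, msLevel = 1L, method = "closest_rt")
# where `xdata` is a hypothetical xcms result object.
# Toy MS1 scan: m/z in column 1, intensity in column 2, ordered by m/z.
ms1 <- cbind(
    mz = c(150.0500, 181.0707, 182.0741, 183.0750, 250.1000),
    intensity = c(2000, 10000, 690, 140, 500)
)

# Look for isotopologue groups that start at the precursor m/z.
idx <- isotopologues(ms1, seedMz = 181.0707, ppm = 10)

# `idx` is a list of row indices into `ms1`; pull out the matching peaks.
iso_peaks <- lapply(idx, function(i) ms1[i, , drop = FALSE])
```

Whether any group is reported depends on the substitution definition used by `isotopologues()`, so the result should be inspected rather than assumed.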
Note: For Sirius integration, we are working on the RuSirius package that should simplify sharing data and results directly with Sirius...
Thanks @jorainer - I was thinking that having them directly associated in the spectrum would be cleaner, but I defer to your expertise. I have been handling parallel structures in RAMClustR forever already, so this isn't a problem on my part. I will follow the precedent!
I will take a look at the RuSirius vignettes! That said, it is still useful to be able to export spectra to various formats. Do you use the various 'backend' packages for export as well as import? Would it be of value to have a .ms and/or .mat (etc.) backend? Honestly, this was the notion that prompted the question. It would seem that if there were backend packages for export, having all the data in a single Spectra object would be cleaner.
Many backends indeed also support exporting data (such as MsBackendMgf, MsBackendMsp, MsBackendMzR etc.). The .ms and .mat files, are they from Sirius? If I remember correctly, @michaelwitting once wanted to write a backend for these file types. You can also point us to the definition of these file formats and we can see if we can implement a backend. All this also aligns very well with our current developments to increase the interoperability between software tools :)
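For reference, exporting through an existing backend might look like this minimal sketch, which uses MsBackendMgf and a single made-up spectrum:

```r
library(Spectra)
library(S4Vectors)     # provides DataFrame()
library(MsBackendMgf)

# One invented MS/MS spectrum.
df <- DataFrame(msLevel = 2L, precursorMz = 181.0707)
df$mz <- list(c(89.02, 119.03, 163.06))
df$intensity <- list(c(40, 100, 15))
sps <- Spectra(df)

# Write the spectra to an MGF file via the backend's export method.
mgf_file <- tempfile(fileext = ".mgf")
export(sps, backend = MsBackendMgf(), file = mgf_file)

mgf_lines <- readLines(mgf_file)
```

A .ms or .mat backend would presumably plug into the same `export()` call, with the backend deciding how spectra variables map to fields in the file.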
I originally wanted to write something for .ms, but time constraints haven't allowed it so far. There is some definition of the .ms format for Sirius here: https://boecker-lab.github.io/docs.sirius.github.io/io/#input
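As a rough illustration of what a writer for this format could involve, here is a base R sketch. It assumes the `>compound`, `>parentmass`, `>ionization`, `>ms1` and `>ms2` fields as I understand them from the linked documentation; `format_ms_record()` is a hypothetical helper, all peak values are invented, and the field names should be checked against the current Sirius docs.

```r
# Hypothetical helper: build the text lines of one .ms record.
# `ms1` and `ms2` are two-column matrices (m/z, intensity).
format_ms_record <- function(name, parent_mz, ms1, ms2,
                             ionization = "[M+H]+") {
    peaks <- function(p) sprintf("%.4f %.1f", p[, 1], p[, 2])
    c(
        paste(">compound", name),
        sprintf(">parentmass %.4f", parent_mz),
        paste(">ionization", ionization),
        "",
        ">ms1",         # isotope pattern of the precursor
        peaks(ms1),
        "",
        ">ms2",         # fragment spectrum
        peaks(ms2)
    )
}

iso <- cbind(mz = c(181.0707, 182.0741, 183.0750),
             intensity = c(100, 6.9, 1.4))
frag <- cbind(mz = c(89.02, 119.03, 163.06),
              intensity = c(40, 100, 15))

rec <- format_ms_record("FT001", 181.0707, iso, frag)
ms_file <- tempfile(fileext = ".ms")
writeLines(rec, ms_file)
```

The isotope pattern and the fragment spectrum are passed in separately here, which fits the two-object approach suggested earlier in the thread.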
From my discussion with the Sirius people, the .ms file is not super user friendly, and on their documentation website they mention it needs an update anyway. I would not focus on that.
The output of Sirius is quite complex (one spectrum = multiple formulas, one formula = multiple structures).
For now, RuSirius allows you to export a table of results and go through the Sirius GUI from within R, but not much more in terms of handling results. I have not figured out how to deal with such complexity. User feedback on what would be needed to make it user friendly would be very helpful :)
@philouail - if you figure this out, well, metabolomics becomes easy....
The GUI for Sirius is a valuable component, especially for users who are not into spreadsheets. Will it be possible to get a .sirius file so that the results can be browsed in the native Sirius GUI?
The .sirius project file gets saved automatically, so it can then be opened in Sirius. But you can also open the native Sirius GUI directly from R using the openGUI() function.
That sounds fantastic, and I am not sure it is easy to do much more than that. The Sirius GUI has a good deal of functionality for interactive sessions. The data frame/spreadsheet output is a great record, and also really valuable. Unless you can devise a method to tell users whether a given best match is 'correct', I have a hard time requesting additional functionality beyond what you have described.