
Problems with FeatureFinderIdentification on Bruker timsTOF data (with ion mobility)

Open hendrikweisser opened this issue 3 years ago • 6 comments

Reported by Jean-Michel Camadro and Laurent Lignières. In their test run, only about a third of the identified peptides could be quantified using FeatureFinderIdentification (FFId). In addition, the same number of features were detected on the whole mzML file and on different RT slices of it.

The mzML input file was generated from Bruker raw data using "Data Analysis" (Bruker software?). Notably, MS2 scans were mixed spectra from several precursors with different m/z values (and different ion mobilities). The idXML input file was generated using IDFileConverter from X! Tandem search results (XML format) of the raw data.

hendrikweisser, Apr 14 '22 12:04

> the same number of features were detected on the whole mzML file and on different RT slices of it

It turns out the reason for this is missing RT information in the idXML file. This leads to chromatogram extraction and feature detection over the whole RT range in FFId. For peptide IDs with appropriate isotope m/z values, a feature could be found in each of the different RT slices.
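A quick way to confirm this diagnosis is to count how many peptide IDs in the idXML carry no RT. The following is only a minimal sketch using the Python standard library; it assumes that idXML stores the retention time as an `RT` attribute on un-namespaced `PeptideIdentification` elements, and the function name and the tiny example document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def count_ids_without_rt(idxml_text):
    """Return (total, missing): number of PeptideIdentification elements
    and how many of them have no (or an empty) RT attribute."""
    root = ET.fromstring(idxml_text)
    total = 0
    missing = 0
    # Assumption: elements are not namespaced and RT is an attribute.
    for pep_id in root.iter("PeptideIdentification"):
        total += 1
        if not pep_id.get("RT"):
            missing += 1
    return total, missing


# Hypothetical, heavily trimmed idXML-like document for demonstration.
EXAMPLE = """<IdXML>
  <IdentificationRun>
    <PeptideIdentification MZ="500.25" RT="1234.5"/>
    <PeptideIdentification MZ="612.31"/>
    <PeptideIdentification MZ="433.20" RT=""/>
  </IdentificationRun>
</IdXML>"""

total, missing = count_ids_without_rt(EXAMPLE)
print(f"{missing} of {total} peptide IDs have no RT")
```

If most or all IDs come back without RT, FFId has no RT window to restrict extraction to, which would match the behaviour described above.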

The solution should be to add the RT information during conversion from X! Tandem XML to idXML in IDFileConverter (look-up via spectrum references using parameter "mz_file"), but currently this doesn't work. The error message is:

Error: Unexpected internal error (Could not convert non-integer DataValue to int)

The cause is that the (outdated?) X! Tandem XML conversion code looks for a meta value "spectrum_id" (should be "spectrum_reference") and isn't aware of the format X! Tandem uses for spectrum references. See: https://github.com/OpenMS/OpenMS/blob/df8d8f5bff67340bf7b26d091e56f9c205e591d1/src/topp/IDFileConverter.cpp#L527
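To illustrate the intended look-up (not the actual IDFileConverter code), here is a rough sketch in plain Python. It assumes a spectrum reference that contains `scan=<number>`, which may well differ from the format X! Tandem really writes; `annotate_rt`, `SCAN_PATTERN` and the `rt_by_scan` table are hypothetical names. The point is that a reference which cannot be parsed or matched is collected instead of triggering a conversion error.

```python
import re

# Assumption: the reference embeds the scan number as "scan=<int>".
# The real X! Tandem reference format may look different.
SCAN_PATTERN = re.compile(r"scan=(\d+)")


def annotate_rt(spectrum_refs, rt_by_scan):
    """Map each spectrum reference to an RT taken from rt_by_scan.

    Returns (rts, unmatched): a dict reference -> RT, and a list of
    references that could not be parsed or were not in the table."""
    rts = {}
    unmatched = []
    for ref in spectrum_refs:
        match = SCAN_PATTERN.search(ref)
        scan = int(match.group(1)) if match else None
        if scan is not None and scan in rt_by_scan:
            rts[ref] = rt_by_scan[scan]
        else:
            unmatched.append(ref)
    return rts, unmatched


# Hypothetical scan -> RT table, as it might be built from the mzML file.
rt_by_scan = {1: 10.0, 2: 20.5}
refs = ["scan=1", "scan=2", "scan=99", "not a scan"]
rts, unmatched = annotate_rt(refs, rt_by_scan)
print(rts)
print(unmatched)
```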

hendrikweisser, Apr 14 '22 12:04

Could this also be integrated into XTandemFilereader class? Because then it would be available in the XTandemAdapter and pyOpenMS as well. Otherwise, we create new/different workarounds every time we deal with X! Tandem data.

jpfeuffer, Apr 14 '22 13:04

> Could this also be integrated into XTandemFilereader class?

Do you mean XTandemXMLFile? I guess it could, by passing an optional SpectrumMetaDataLookup parameter to load (analogous to PepXMLFile::load). The way I've now implemented it is as post-processing in IDFileConverter, though. See changes in my "idfileconverter-xtandem" branch: https://github.com/OpenMS/OpenMS/compare/develop...hendrikweisser:idfileconverter-xtandem?expand=1

hendrikweisser, Apr 14 '22 15:04

> The solution should be to add the RT information during conversion from X! Tandem XML to idXML in IDFileConverter (look-up via spectrum references using parameter "mz_file")

I've implemented this now, but testing reveals that the scan numbers don't match - scan numbers in the search results go up to 10x higher than the number of spectra in the mzML. I guess the analysis software must de-multiplex the MS2 spectra before running X! Tandem, resulting in many more spectra being searched, but this makes it impossible to map the search results to spectra in the mzML.
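The mismatch is easy to check for before attempting any look-up. A small sketch, assuming 1-based scan numbers and that the scan numbers from the search results and the spectrum count of the mzML are already available as plain Python values (the function name and the example numbers are invented, not taken from the reported data):

```python
def scan_range_report(scan_numbers, n_spectra):
    """Summarise how search-result scan numbers relate to the number of
    spectra in the mzML file (assumes 1-based scan numbering)."""
    out_of_range = [s for s in scan_numbers if not 1 <= s <= n_spectra]
    max_scan = max(scan_numbers)
    return {
        "max_scan": max_scan,
        "n_spectra": n_spectra,
        "out_of_range": len(out_of_range),
        "ratio": max_scan / n_spectra,
    }


# Invented numbers: 100 spectra in the file, scan numbers up to 1000.
report = scan_range_report([1, 50, 400, 1000], n_spectra=100)
print(report)
```

A ratio well above 1 means that scan numbers cannot be simple indices into the mzML, so a direct look-up would either fail or silently assign the wrong RT.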

hendrikweisser, Apr 14 '22 16:04

Hmm yes, sounds bad. Unless the analysis software uses a fixed binning approach and you can use the number of IM bins to map back. Or raise an issue at the upstream software that this information should be passed through.
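If the software really did use fixed binning, the back-mapping would be plain integer arithmetic. The sketch below is purely hypothetical: it assumes that every MS2 spectrum is split into exactly `n_im_bins` consecutive de-multiplexed spectra and that indices are 0-based, neither of which is confirmed for the Bruker software.

```python
def map_to_original(demux_index, n_im_bins):
    """Map a 0-based de-multiplexed spectrum index back to
    (original spectrum index, ion mobility bin).

    Assumption: each original spectrum yields exactly n_im_bins
    consecutive de-multiplexed spectra."""
    if n_im_bins < 1:
        raise ValueError("n_im_bins must be at least 1")
    if demux_index < 0:
        raise ValueError("demux_index must not be negative")
    return divmod(demux_index, n_im_bins)


# With 10 IM bins per spectrum, de-multiplexed spectrum 37 would be
# IM bin 7 of original spectrum 3.
print(map_to_original(37, 10))
```

With a variable number of precursors per MS2 spectrum this breaks down, and the mapping information would have to come from the upstream software instead.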

jpfeuffer, Apr 14 '22 18:04

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot], Dec 01 '25 03:12