
Add a `scan number` term

Open fcyu opened this issue 4 months ago • 8 comments

Describe the new term or terms you would like to add.

For a long time, several search engines, including MSFragger and Comet (correct me if I am wrong), have used a scan number to identify/index a scan/spectrum. Using a single integer is faster and more memory-efficient than using a string.

As far as I know, there are two major approaches to getting a scan number from a spectrum's metadata:

  1. Use `index + 1`.
  2. Extract the scan number from the id. For example, given a Thermo id `controllerType=0 controllerNumber=1 scan=5`, the scan number is 5.

The first approach has a problem: if the mzML file is a subset of the original mzML, the index is re-assigned starting from 0, so the scan numbers differ from those in the original mzML. A typical example is FragPipe, where MSFragger generates `_(un)calibrated.mzML` files containing only MS2 scans for downstream tools to use. Using `index + 1` as the scan number would cause problems there.
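
To make the mismatch concrete, here is a minimal Python sketch of the two approaches applied to a subset file. The function names and the toy list of six scans (with scans 2, 3, 5, and 6 treated as MS2) are made up for illustration; this is not how MSFragger or Comet implement it.

```python
import re


def scan_number_from_index(index: int) -> int:
    """Approach 1: the 0-based position in the file plus one."""
    return index + 1


def scan_number_from_id(native_id: str) -> int:
    """Approach 2: pull the integer after 'scan=' out of the spectrum id."""
    match = re.search(r"(?:^|\s)scan=(\d+)(?:\s|$)", native_id)
    if match is None:
        raise ValueError(f"no scan field in spectrum id: {native_id!r}")
    return int(match.group(1))


# Hypothetical original file with six spectra in Thermo id format.
original_ids = [
    f"controllerType=0 controllerNumber=1 scan={n}" for n in range(1, 7)
]

# Hypothetical MS2-only subset: keep scans 2, 3, 5 and 6.
# The ids are preserved, but the positions restart from 0.
subset_ids = [original_ids[i] for i in (1, 2, 4, 5)]

by_index = [scan_number_from_index(i) for i in range(len(subset_ids))]
by_id = [scan_number_from_id(s) for s in subset_ids]

print(by_index)  # positions in the subset file, no longer the original scans
print(by_id)     # scan numbers recovered from the ids
```

In this toy case the index-based numbers are 1 to 4, while the id-based numbers still match the scans in the original file.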

The second approach works well for most Thermo data because the spectrum id is "1-D": only the scan field changes in the `controllerType=0 controllerNumber=1 scan=N` format, and we extract N as the scan number. But for data from some other vendors, this approach doesn't work because multiple fields change in the spectrum id. For example, in `function=2 process=0 scan=1`, both `function` and `scan` change from scan to scan. For this reason, we discontinued support for Waters and SCIEX data.
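
A rough way to check whether a set of ids is "1-D" in this sense is to split each id into `key=value` fields and see which keys vary. The sketch below is only illustrative: the helper names are hypothetical, and the example ids are hand-written in the two formats mentioned above rather than taken from real files.

```python
def parse_native_id(native_id: str) -> dict[str, str]:
    """Split a space-separated 'key=value' spectrum id into a dict."""
    return dict(field.split("=", 1) for field in native_id.split())


def varying_fields(native_ids: list[str]) -> set[str]:
    """Return the keys whose value differs between at least two ids."""
    parsed = [parse_native_id(s) for s in native_ids]
    keys = set().union(*(p.keys() for p in parsed))
    return {k for k in keys if len({p.get(k) for p in parsed}) > 1}


# Hand-written example ids (assumed, for illustration only).
thermo_ids = [
    "controllerType=0 controllerNumber=1 scan=1",
    "controllerType=0 controllerNumber=1 scan=2",
    "controllerType=0 controllerNumber=1 scan=3",
]
waters_ids = [
    "function=1 process=0 scan=1",
    "function=2 process=0 scan=1",
    "function=1 process=0 scan=2",
]

print(varying_fields(thermo_ids))  # only 'scan' varies: a single integer is enough
print(varying_fields(waters_ids))  # 'function' and 'scan' vary: 'scan' alone is ambiguous
```

In the second list, two different spectra share `scan=1`, so extracting only the scan field would map them to the same scan number.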

Recently, several users (e.g., https://github.com/Nesvilab/MSFragger/issues/324 and https://x.com/michaellazear/status/1782905716896100437) have asked us to bring the support back. It would make life much easier if those data indexed their scans using a 1-D scheme. Thus, I propose adding a `scan number` term for use in mzML, pepXML, and other XML-based files.

I hope my explanations are clear. Let me know if you have any questions.

Best,

Fengchao

fcyu avatar Oct 07 '24 00:10 fcyu