Matched ions

Open kevinkovalchik opened this issue 4 years ago • 6 comments

Hello,

I am interested in getting lists of matched ions for each PSM as part of the search results. For example, something like:

matched_y_ions="y1+=175.118952913371, y3+=361.18301123237103, etc...", matched_b_ions="..."

I do not see anything in the parameters that looks like this. Is there a switch for this that I am not seeing?

Thanks!

Kevin

kevinkovalchik avatar Oct 15 '20 20:10 kevinkovalchik

Hi Kevin, Reporting a list of specific ions matched for each PSM is not a feature in MSFragger at this time. You can see the total number of matched ions for each PSM in the pepXML output (the tag to look for is "num_matched_ions"), but not which ions make up the total. I think the pepXML schema does not support reporting a detailed list of matched ions (but I could be wrong).
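
For reference, a minimal sketch of pulling the matched-ion counts out of a pepXML file with the Python standard library. It assumes the usual pepXML layout (`spectrum_query` elements containing `search_hit` elements that carry `num_matched_ions` and `tot_num_ions` attributes); the function name and the tiny sample document are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}search_hit'."""
    return tag.rsplit("}", 1)[-1]


def matched_ion_counts(source):
    """Return (spectrum, peptide, num_matched_ions, tot_num_ions) for every search hit.

    `source` is a file name or a binary file object. The file is streamed,
    so large pepXML files do not have to fit in memory.
    """
    rows = []
    spectrum = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        name = local_name(elem.tag)
        if event == "start" and name == "spectrum_query":
            # Attributes are already available on the "start" event.
            spectrum = elem.get("spectrum")
        elif event == "end" and name == "search_hit":
            rows.append((
                spectrum,
                elem.get("peptide"),
                int(elem.get("num_matched_ions", "0")),
                int(elem.get("tot_num_ions", "0")),
            ))
        elif event == "end" and name == "spectrum_query":
            elem.clear()  # free memory for the finished spectrum
    return rows


# Hypothetical, heavily trimmed pepXML used only to exercise the function.
SAMPLE = b"""<?xml version="1.0"?>
<msms_pipeline_analysis xmlns="http://regis-web.systemsbiology.net/pepXML">
 <msms_run_summary>
  <spectrum_query spectrum="run1.00001.00001.2" assumed_charge="2">
   <search_result>
    <search_hit hit_rank="1" peptide="PEPTIDER" num_matched_ions="9" tot_num_ions="14"/>
   </search_result>
  </spectrum_query>
 </msms_run_summary>
</msms_pipeline_analysis>"""

if __name__ == "__main__":
    for row in matched_ion_counts(io.BytesIO(SAMPLE)):
        print(row)
```

This only gives the counts, not which ions were matched, which is the limitation described above.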

dpolasky avatar Oct 15 '20 20:10 dpolasky

BTW, if you want to know which peaks could be matched, you can use PDV (https://github.com/wenbostar/PDV) to plot the annotated spectra.

Best,

Fengchao

fcyu avatar Oct 15 '20 21:10 fcyu

Thanks, Fengchao and Daniel. I know Comet will export matched ions, but perhaps their format is not standard pepXML? It has been a while since I looked at it. Anyway, I will either get the ions out myself or use Comet to do it.

Thanks for pointing me to PDV. It isn't useful for this particular case because I am pulling out the matched ions from hundreds of runs, but it looks like a very nice tool and I was unaware of it.
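
For anyone else who wants to get the ions out themselves, here is a rough sketch of the idea: compute theoretical b/y m/z values for a peptide and match them to the observed peaks within a ppm tolerance. It is not how MSFragger or Comet do it. It assumes an unmodified peptide (no fixed or variable modifications), monoisotopic masses, and b/y ions only; the function names and the 20 ppm default are my own choices.

```python
from bisect import bisect_left

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_mzs(peptide, charge=1):
    """Return {label: m/z} for the b and y ions of an unmodified peptide."""
    ions = {}
    n = len(peptide)
    suffix_label = "+" * charge
    prefix_mass = 0.0
    suffix_mass = 0.0
    for i in range(1, n):
        prefix_mass += RESIDUE[peptide[i - 1]]  # first i residues
        suffix_mass += RESIDUE[peptide[n - i]]  # last i residues
        ions[f"b{i}{suffix_label}"] = (prefix_mass + charge * PROTON) / charge
        ions[f"y{i}{suffix_label}"] = (suffix_mass + WATER + charge * PROTON) / charge
    return ions


def match_ions(peptide, peak_mzs, tol_ppm=20.0, charge=1):
    """Return {label: (observed m/z, error in ppm)} for ions with a peak within tol_ppm."""
    peaks = sorted(peak_mzs)
    matches = {}
    for label, mz in fragment_mzs(peptide, charge).items():
        i = bisect_left(peaks, mz)
        candidates = peaks[max(i - 1, 0):i + 1]  # nearest peak on either side
        if not candidates:
            continue
        best = min(candidates, key=lambda p: abs(p - mz))
        err_ppm = (best - mz) / mz * 1e6
        if abs(err_ppm) <= tol_ppm:
            matches[label] = (best, err_ppm)
    return matches


if __name__ == "__main__":
    # Made-up peak list: one peak close to y1 of a C-terminal arginine.
    for label, (mz, err) in match_ions("PEPTIDER", [175.1190, 300.0]).items():
        print(f"{label}={mz} ({err:.2f} ppm)")
```

The ppm error per matched fragment comes out as a side effect, which is handy for QC plots. Modifications would need to be added to the residue masses before this is usable on real search results.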

Best wishes, Kevin

kevinkovalchik avatar Oct 16 '20 00:10 kevinkovalchik

Hi Kevin,

Yes, Comet can output matched ions, but only in its .out files. As far as I know, pepXML doesn't have any field for this information.

Best,

Fengchao

fcyu avatar Oct 16 '20 13:10 fcyu

Hello,

Are there any plans to implement this feature in the future?

Thanks, Evgeniy.

EvgeniyReznik avatar Dec 24 '23 13:12 EvgeniyReznik

Yes.

Best,

Fengchao

fcyu avatar Dec 24 '23 15:12 fcyu

We are also highly interested in this and more features !!

It would also be very nice to get the matched m/z error for each fragment ion too. We are switching from MQ atm, and this is one of the very useful columns in MQ.

It's a helpful qc / optimisation metric for the mass analyser and for creating all sorts of spectral visualisations.

Is there also a way to add percolator score and q value for each psm? At the moment it's not possible from the xml alone to get to the final outcome because the percolator information is missing.

Thanks & Best Flo

FloBay avatar May 13 '24 21:05 FloBay

Is there also a way to add percolator score and q value for each psm? At the moment it's not possible from the xml alone to get to the final outcome because the percolator information is missing.

Which xml file were you referring to? interact-*.pep.xml has the probability which is 1 - Percolator PEP.

As for the q-value, the results are not (and should not be) filtered with a single PSM-level q-value. FragPipe applies sequential or 2-D filtering that combines PSM- and protein-level FDR. You would probably be better off using the psm.tsv file.

Best,

Fengchao

fcyu avatar May 13 '24 22:05 fcyu

Thanks for the fast reply...

Which xml file were you referring to? interact-*.pep.xml has the probability which is 1 - Percolator PEP.

Thanks for pointing me to the other xml file. I only checked the *.pepXML but not the interact-*.pep.xml. Do you have a reference file somewhere explaining the columns (e.g. ntt, nmc)? So the 'peptideprophet_probability' is actually 1 - Percolator PEP when Percolator is run? I cannot find another probability column in this file.

As to the q-value, the results are not/should not be filtered with a single PSM-level q-value. FragPipe applied sequential or 2-D filtering combining PSM- and protein-level FDR. I guess you'd probably better to use the psm.tsv file.

I totally agree for standard usage. But we were playing around a bit (e.g. plotting target-decoy distributions of scores, mass errors, etc.) while optimising a new MS + search engines atm. We couldn't find all the values we were interested in in the psm.tsv, which is why I started to look in other places too. Yes, one can put things back together from the Percolator output file (if it has not been deleted), but for colleagues with less programming experience it's not that straightforward. So it would be cool if the PSM_Ranks, Percolator Score, and Percolator PEP that were actually used would eventually appear in the psm.tsv file.

FloBay avatar May 14 '24 00:05 FloBay

Do you have a reference file somewhere explaining the columns (e.g. ntt, nmc)?

Here is the tutorial (https://fragpipe.nesvilab.org/docs/tutorial_fragpipe_outputs.html#psmtsv), but it is a little outdated. I will update it in the coming weeks.

So the 'peptideprophet_probability' is actually 1 - Percolator PEP, when run percolator? I cannot find another probability column in this file.

That's correct, PeptideProphet Probability is the PSM probability. When using PeptideProphet, it is PeptideProphet probability. When using Percolator, it is 1 - Percolator PEP. We will change the column name to Probability in the next release.
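
As a small illustration of that relationship, a sketch that recovers the PEP from the probability column of a psm.tsv-style table. The column name `PeptideProphet Probability` is taken from the discussion above and is passed as a parameter because it is expected to change; the result is only a Percolator PEP when Percolator was used for rescoring. The function name and sample row are made up.

```python
import csv
import io


def rows_with_pep(tsv_file, prob_column="PeptideProphet Probability"):
    """Yield each row of a tab-separated PSM table with an added 'PEP' = 1 - probability."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    for row in reader:
        row["PEP"] = 1.0 - float(row[prob_column])
        yield row


# Hypothetical two-column excerpt, only to exercise the function.
SAMPLE = "Peptide\tPeptideProphet Probability\nPEPTIDER\t0.99\n"

if __name__ == "__main__":
    for row in rows_with_pep(io.StringIO(SAMPLE)):
        print(row["Peptide"], row["PEP"])
```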

Thus would be cool if the PSM_Ranks, Percolator Score, Percolator PEP that was actually used would appear in the psm.tsv file eventually.

ranks and Percolator PEP are already (kind of) in the psm.tsv, but the Percolator score is not. We will discuss internally to see if it is necessary to propagate this information, since it is not very useful to general users.

Best,

Fengchao

fcyu avatar May 14 '24 00:05 fcyu

Similar to https://github.com/Nesvilab/FragPipe/issues/885

I have implemented the feature to report the matched fragments. If you let MSFragger generate the tsv report, it will have the two new columns. If you want to try the pre-release version, please send an email to yufeATumich.edu (replace AT with @).

Currently, there are some limitations:

- The reported fragments are not exactly the ones used in scoring, because MSFragger does not record that information (to stay fast). Instead, I let it "re-match" the fragments when generating the report.
- The fragments are deisotoped by default, so the reported matches include only singly charged fragments. Disabling deisotoping in the database search allows doubly charged fragments to be matched, but the sensitivity is sub-optimal.
- It only supports closed search for now, but I will extend it to open and mass-offset searches in the future.

Best,

Fengchao

fcyu avatar Jun 21 '24 20:06 fcyu