Different protein groups for the same peptide at different charge states

Open Arthfael opened this issue 1 year ago • 3 comments

I noticed something strange today after a colleague asked me to add a number-of-peptides column to her DiaNN pg matrix. I loaded the report.tsv in R, aggregated protein groups by modified sequence, and was surprised to find that a few PSMs with the same modified sequence but different charge states had been assigned to different protein groups. I cannot think of a good explanation for this: the observed charge state should not affect the assignment to a protein group as long as the modified sequence is the same.
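A minimal sketch of this kind of check, written in Python rather than R so that it only needs the standard library. It assumes the report uses the column names `Modified.Sequence`, `Precursor.Charge` and `Protein.Group`; adjust them if your report.tsv differs. The function names and the toy rows are hypothetical and only illustrate the logic, they are not taken from the dataset.

```python
import csv
from collections import defaultdict


def read_report(path):
    """Yield the rows of a tab-separated report as dicts keyed by column name."""
    with open(path, newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")


def find_conflicting_groups(rows,
                            seq_col="Modified.Sequence",
                            pg_col="Protein.Group",
                            charge_col="Precursor.Charge"):
    """Return modified sequences that are assigned to more than one protein group.

    The result maps each such sequence to {protein group: sorted charge states}.
    Column names are assumptions about the report layout.
    """
    seen = defaultdict(lambda: defaultdict(set))
    for row in rows:
        seen[row[seq_col]][row[pg_col]].add(row[charge_col])
    return {
        seq: {pg: sorted(charges) for pg, charges in groups.items()}
        for seq, groups in seen.items()
        if len(groups) > 1  # keep only sequences with conflicting assignments
    }


# Hypothetical toy rows, NOT real data: one sequence with a conflict, one without.
toy_rows = [
    {"Modified.Sequence": "PEPTIDEK", "Precursor.Charge": "2", "Protein.Group": "P11111"},
    {"Modified.Sequence": "PEPTIDEK", "Precursor.Charge": "3", "Protein.Group": "P11111;P22222"},
    {"Modified.Sequence": "ANOTHERK", "Precursor.Charge": "2", "Protein.Group": "P33333"},
    {"Modified.Sequence": "ANOTHERK", "Precursor.Charge": "3", "Protein.Group": "P33333"},
]
conflicts = find_conflicting_groups(toy_rows)
```

On a real report, `find_conflicting_groups(read_report("report.tsv"))` would list every modified sequence whose charge states ended up in different protein groups, together with the charge states seen for each group.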

This discrepancy affects maybe 15 modified sequences in a dataset that contains more than 100k, so it looks very minor, but it also raises the question of whether the assignments that do map to a single protein group are correct. Indeed, I have noticed for a while, without investigating in detail, that there are a few discrepancies in the mapping of peptides to protein accessions compared with what I would normally expect. Of course, it all depends on the protein grouping algorithm and on what one decides should be in those columns (all matches to any protein ID, versus only matches to discovered proteins, in the sense of leading proteins of a protein group), but I still think there may be some small issues there too, though probably only minor ones.

I am happy to share any data to illustrate the point, although the files are rather large.

Arthfael, Nov 14 '23 08:11