Unintuitive proteoform/protein output

Open zrolfs opened this issue 3 years ago • 1 comments

I made a file that contained 8 PrSMs, one at each of the proteoform ambiguity classification levels (1, 2A, 2B, 2C, 2D, 3, 4, and 5). 8 PrSMs were reported in the PrSM output, 3 proteoforms in the proteoform output, and 6 proteins in the ProteinGroup output.

It's a little weird to me that we only reported 3 unique proteoforms, even though we identified 8 unique proteoforms. Stranger still is the ability to identify 6 unique protein groups from only 3 unique proteoforms.

This happens because the Peptide/Proteoform output requires an unambiguous full sequence, while the ProteinGroup output only requires an unambiguous base sequence for parsimony.
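
To illustrate how the two filters can give different counts, here is a minimal Python sketch. It is not MetaMorpheus code: the `Prsm` class, the helper names, and the toy sequences are all hypothetical, and it only assumes that each PrSM carries a set of candidate full sequences and a set of candidate base sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prsm:
    """Hypothetical PrSM record; not the MetaMorpheus data model."""
    full_sequences: frozenset  # candidate sequences including modifications
    base_sequences: frozenset  # candidate sequences without modifications


def unambiguous_full(prsms):
    """Full sequences of PrSMs that have exactly one candidate full sequence."""
    return {next(iter(p.full_sequences)) for p in prsms if len(p.full_sequences) == 1}


def unambiguous_base(prsms):
    """Base sequences of PrSMs that have exactly one candidate base sequence."""
    return {next(iter(p.base_sequences)) for p in prsms if len(p.base_sequences) == 1}


# Toy data (made up for illustration only).
toy_prsms = [
    # Fully resolved: one full sequence, one base sequence.
    Prsm(frozenset({"PEPT[Ox]IDE"}), frozenset({"PEPTIDE"})),
    # Modification site ambiguous: two full sequences, one base sequence.
    Prsm(frozenset({"PROT[Ox]EIN", "PROTE[Ox]IN"}), frozenset({"PROTEIN"})),
    # Sequence ambiguous: two full sequences, two base sequences.
    Prsm(frozenset({"SEQA", "SEQB"}), frozenset({"SEQA", "SEQB"})),
]

print(len(toy_prsms), "PrSMs")
print(len(unambiguous_full(toy_prsms)), "pass the full-sequence filter")
print(len(unambiguous_base(toy_prsms)), "pass the base-sequence filter")
```

In this toy case the second PrSM is dropped by the full-sequence filter but still passes the base-sequence filter, which is the same kind of mismatch as reporting more protein groups than proteoforms.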

I'm not sure how pressing this issue is (or if it's even an issue), but it doesn't look like a quick fix.

zrolfs avatar May 27 '21 17:05 zrolfs

I'm thinking about proteoform parsimony... Say we identified two PrSMs: A) PROTEOFORM (with an unlocalized +16 mass shift) and B) PROTEOFORM(Ox).

We should output a single proteoform "PROTEOFORM(Ox)" for the two, rather than reporting both.
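
A rough Python sketch of that collapsing rule, as one possible reading of the idea rather than an actual implementation. Everything here is an assumption: the `Identification` class, the `collapse` function, the 0.01 Da tolerance, treating the "+16" shift as 15.995 Da, and the extra `OTHERFORM` entry added to show a case that is not collapsed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identification:
    """Hypothetical proteoform-level identification."""
    base_sequence: str
    full_sequence: str
    mass_shift: float  # total modification mass in Da
    localized: bool    # True if the modification is assigned to a residue


def collapse(ids, tol=0.01):
    """Drop unlocalized IDs that a localized ID on the same base sequence,
    with the same total mass shift (within tol), already explains."""
    localized = [i for i in ids if i.localized]
    kept = []
    for i in ids:
        if not i.localized:
            explained = any(
                loc.base_sequence == i.base_sequence
                and abs(loc.mass_shift - i.mass_shift) <= tol
                for loc in localized
            )
            if explained:
                continue  # reported via the localized proteoform instead
        if i not in kept:
            kept.append(i)
    return kept


# A) unlocalized +16 on PROTEOFORM, B) localized oxidation, plus an unrelated ID.
a = Identification("PROTEOFORM", "PROTEOFORM", 15.995, localized=False)
b = Identification("PROTEOFORM", "PROTEOFORM(Ox)", 15.995, localized=True)
c = Identification("OTHERFORM", "OTHERFORM", 42.011, localized=False)

reported = [i.full_sequence for i in collapse([a, b, c])]
print(reported)
```

With these inputs, A is absorbed into B, while the unrelated unlocalized identification is still reported because no localized proteoform explains it.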

zrolfs avatar Jun 07 '21 13:06 zrolfs