
The charge min and max and missed cleavages settings are sometimes not working

Open ypriverol opened this issue 1 year ago • 3 comments

Description of the bug

@jpfeuffer @timosachsenberg @daichengxin I found one dataset that we searched using MS-GF+. Here is the command:

#!/bin/bash -euo pipefail
MSGFPlusAdapter \
    -protocol automatic \
    -in 01086_C01_P010738_S00_N03_R1.mzML \
    -out 01086_C01_P010738_S00_N03_R1_msgf.idXML \
    -executable $(find /usr/local/share/msgf_plus-*/MSGFPlus.jar -maxdepth 0) \
    -threads 6 \
    -java_memory 30720 \
    -database "GRCh38r110_GCA97s_coding_proteins_19Jul23-decoy.fa" \
    -instrument high_res \
    -matches_per_spec 1 \
    -min_precursor_charge 2 \
    -max_precursor_charge 4 \
    -min_peptide_length 6 \
    -max_peptide_length 40 \
    -max_missed_cleavages 2 \
    -isotope_error_range 0,1 \
    -enzyme "Trypsin/P" \
    -tryptic fully \
    -precursor_mass_tolerance 40.0 \
    -precursor_error_units ppm \
    -fixed_modifications 'Carbamidomethyl (C)' \
    -variable_modifications 'Acetyl (Protein N-term)' 'Deamidated (N)' 'Deamidated (Q)' 'Oxidation (M)' \
    -max_mods 3 \
    -PeptideIndexing:IL_equivalent \
    -PeptideIndexing:unmatched_action warn \
    -debug 0 \
     \
    2>&1 | tee 01086_C01_P010738_S00_N03_R1_msgf.log

However, in the output file I found the following identification, with charge 6 even though `-max_precursor_charge` was set to 4:

		<PeptideIdentification score_type="SpecEValue" higher_score_better="false" significance_threshold="0.0" MZ="664.68194580078125" RT="3378.397500000000036" spectrum_reference="controllerType=0 controllerNumber=1 scan=24975" >
			<PeptideHit score="1.4043417e-21" sequence="INNAHTIGC(Carbamidomethyl)NAVSWAPAVVPGSLIDHPSGQKPNYIKR" charge="6" aa_before="K K K K K K K K K K K K K K K" aa_after="F F F F F F F F F F F F F F F" start="130 147 144 130 190 130 147 144 130 190 130 147 144 130 190" end="166 183 180 166 226 166 183 180 166 226 166 183 180 166 226" protein_refs="PH_14293 PH_14294 PH_14295 PH_14296 PH_14297 PH_44721 PH_44722 PH_44723 PH_44724 PH_44725 PH_112619 PH_112620 PH_112621 PH_112622 PH_112623" >
				<UserParam type="float" name="MS:1002049" value="103.0"/>
				<UserParam type="float" name="MS:1002050" value="165.0"/>
				<UserParam type="float" name="MS:1002052" value="1.4043417e-21"/>
				<UserParam type="float" name="MS:1002053" value="6.614773000000001e-14"/>
				<UserParam type="string" name="AssumedDissociationMethod" value="HCD"/>
				<UserParam type="string" name="CTermIonCurrentRatio" value="0.3437819"/>
				<UserParam type="string" name="ExplainedIonCurrentRatio" value="0.39947474"/>
				<UserParam type="string" name="MS2IonCurrent" value="2429519.8"/>
				<UserParam type="string" name="MeanErrorAll" value="4.888304"/>
				<UserParam type="string" name="MeanErrorTop7" value="2.5796666"/>
				<UserParam type="string" name="MeanRelErrorAll" value="-0.8928608"/>
				<UserParam type="string" name="MeanRelErrorTop7" value="2.5497687"/>
				<UserParam type="string" name="NTermIonCurrentRatio" value="0.055692848"/>
				<UserParam type="string" name="NumMatchedMainIons" value="23"/>
				<UserParam type="string" name="StdevErrorAll" value="4.698519"/>
				<UserParam type="string" name="StdevErrorTop7" value="1.8443376"/>
				<UserParam type="string" name="StdevRelErrorAll" value="6.7211905"/>
				<UserParam type="string" name="StdevRelErrorTop7" value="1.885455"/>
				<UserParam type="float" name="calcMZ" value="664.51446533203125"/>
				<UserParam type="int" name="pass_threshold" value="1"/>
				<UserParam type="int" name="start" value="191"/>
				<UserParam type="int" name="end" value="227"/>
				<UserParam type="string" name="target_decoy" value="target"/>
				<UserParam type="string" name="isotope_error" value="1"/>
				<UserParam type="string" name="protein_references" value="non-unique"/>
			</PeptideHit>
			<UserParam type="string" name="MS:1001115" value="24975"/>
		</PeptideIdentification>

What could be the problem? This also happens with Comet.

Command used and terminal output

No response

Relevant files

No response

System information

No response

ypriverol avatar Mar 25 '24 21:03 ypriverol
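To see which of the search settings the reported hit actually violates, the values from the command and from the `PeptideHit` above can be compared directly. This is only a minimal sketch: the helper names `strip_mods` and `missed_cleavages_trypsin_p` are made up for illustration, and it assumes that Trypsin/P counts every internal K or R as a missed cleavage site, including one followed by P.

```python
import re

# Limits copied from the MSGFPlusAdapter command in the report.
MIN_CHARGE, MAX_CHARGE = 2, 4
MIN_LENGTH, MAX_LENGTH = 6, 40
MAX_MISSED_CLEAVAGES = 2


def strip_mods(seq):
    """Remove bracketed modification names and any non-residue characters."""
    return re.sub(r"\([^)]*\)|[^A-Z]", "", seq)


def missed_cleavages_trypsin_p(seq):
    """Count internal K/R residues (assumption: Trypsin/P also cleaves before P)."""
    plain = strip_mods(seq)
    return sum(1 for aa in plain[:-1] if aa in "KR")


# Values copied from the reported PeptideHit.
sequence = "INNAHTIGC(Carbamidomethyl)NAVSWAPAVVPGSLIDHPSGQKPNYIKR"
charge = 6

length = len(strip_mods(sequence))
missed = missed_cleavages_trypsin_p(sequence)

charge_ok = MIN_CHARGE <= charge <= MAX_CHARGE
length_ok = MIN_LENGTH <= length <= MAX_LENGTH
missed_ok = missed <= MAX_MISSED_CLEAVAGES

print(f"length={length} ok={length_ok}")
print(f"missed_cleavages={missed} ok={missed_ok}")
print(f"charge={charge} ok={charge_ok}")
```

Under these assumptions the hit has 37 residues and 2 missed cleavages, both within the requested limits, so for this particular identification only the charge of 6 falls outside the 2 to 4 range.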

https://github.com/OpenMS/OpenMS/blob/079143800f7ed036a7c68ea6e124fe4f5cfc9569/src/topp/MSGFPlusAdapter.cpp#L166 According to this comment in our adapter, the precursor charge range is only used if no charge is annotated in the mzML.

timosachsenberg avatar Mar 25 '24 21:03 timosachsenberg

@jpfeuffer @timosachsenberg Would it make sense to add a parameter to filter the PSMs to that charge range?

ypriverol avatar Mar 25 '24 21:03 ypriverol

Good question. I think these high-charge peptides are potentially interesting, so one could argue that one wants them to be reported. On the other hand, you get more defined / consistent results without filtering. I would probably keep them by default, but I could add an optional filter if we decide that we want to filter them.

timosachsenberg avatar Mar 26 '24 07:03 timosachsenberg
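The optional charge filter discussed above could look roughly like the following post-processing step. This is a sketch under stated assumptions, not the filter that OpenMS or quantms implements: it uses only Python's standard `xml.etree.ElementTree`, the function name `filter_hits_by_charge` is hypothetical, and the sample document is a heavily simplified stand-in for a real idXML file.

```python
import xml.etree.ElementTree as ET


def filter_hits_by_charge(root, min_charge, max_charge):
    """Drop PeptideHit elements whose charge is outside [min_charge, max_charge].

    PeptideIdentification elements left without any hit are dropped as well.
    Hits without a charge attribute are treated as charge 0 and removed.
    Returns the number of removed hits.
    """
    removed = 0
    # Materialise the element list first so the tree is not mutated mid-iteration.
    for parent in list(root.iter()):
        for pep_id in list(parent.findall("PeptideIdentification")):
            for hit in list(pep_id.findall("PeptideHit")):
                charge = int(hit.get("charge", "0"))
                if not (min_charge <= charge <= max_charge):
                    pep_id.remove(hit)
                    removed += 1
            if pep_id.find("PeptideHit") is None:
                parent.remove(pep_id)
    return removed


# Simplified, made-up stand-in for an idXML file (not real search output).
SAMPLE = """<IdXML><IdentificationRun>
<PeptideIdentification spectrum_reference="scan=24975">
  <PeptideHit sequence="PEPTIDEK" charge="6"/>
</PeptideIdentification>
<PeptideIdentification spectrum_reference="scan=1">
  <PeptideHit sequence="PEPTIDER" charge="2"/>
  <PeptideHit sequence="PEPTLDER" charge="5"/>
</PeptideIdentification>
</IdentificationRun></IdXML>"""

root = ET.fromstring(SAMPLE)
removed = filter_hits_by_charge(root, min_charge=2, max_charge=4)
kept_charges = [hit.get("charge") for hit in root.iter("PeptideHit")]
print(removed, kept_charges)
```

Keeping the filter as a separate, opt-in step matches the suggestion in the thread: high-charge identifications stay in the search output by default and are only removed when a charge range is explicitly requested.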