mzLib
deconvolution wishlist
Some output I would like to be able to access in PS:
- specific charge state info (intensity, number of charges, monoisotopic m/z)
- apex RT (at what retention time was proteoform most abundant)
Some options I would like for input parameters:
- (What is the intensity ratio setting?)
- setting for min S/N ratio of peaks to consider for deconvolution
- setting for minimum charge state to consider for deconvolution (set to 5 for deconvoluting proteins with Thermo Deconvolution)
- setting for RT window size over which to aggregate features
Regarding the first two, I believe the deconvolution output returns enough to extract this information.
Intensity ratio setting is a hard limit on how different relative intensities are allowed to be from averagine intensities. The more molecules we have, the closer the intensity levels should match the theoretical intensities, which are similar to averagine intensities.
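To make the idea of a hard limit concrete, here is a minimal Python sketch. It is one possible reading of the setting, not the actual mzLib code: the theoretical (averagine-like) envelope is scaled to the observed envelope at its most intense theoretical peak, and every other observed peak must then be within a factor of `limit` of its scaled theoretical intensity. The function name and the scaling choice are assumptions made for illustration.

```python
def within_intensity_ratio(observed, theoretical, limit):
    """Return True if every observed isotope peak is within a factor of
    `limit` of the scaled theoretical intensity (hypothetical check)."""
    if not observed or len(observed) != len(theoretical):
        raise ValueError("envelopes must be non-empty and the same length")
    if limit < 1:
        raise ValueError("limit must be >= 1")

    # Assumption: scale the theoretical envelope at its most intense peak.
    apex = max(range(len(theoretical)), key=theoretical.__getitem__)
    if observed[apex] <= 0 or theoretical[apex] <= 0:
        return False
    scale = observed[apex] / theoretical[apex]

    for obs, theo in zip(observed, theoretical):
        expected = theo * scale
        if expected <= 0:
            continue  # nothing expected here, so nothing to compare
        if obs <= 0:
            return False  # an expected peak is missing entirely
        ratio = obs / expected
        if ratio > limit or ratio < 1.0 / limit:
            return False
    return True
```

With `limit = 5`, an observed peak may be up to five times higher or lower than its scaled theoretical value before the envelope is rejected.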
Would you define S/N? In mzML files, the "noise" array is not present, so it must be estimated in some way?
Min charge state - I will add this in.
RT window size: how is this useful? I feel that by restricting RT, we will only deteriorate our deconvolution results.
For the first two, not really. I can get the charges seen and calculate the m/z of each charge state, but I don't know the intensity of each charge state. In PS we weight the monoisotopic mass calculation by the intensity of each charge state. I don't know if a similar calculation is already being done.
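For reference, the intensity-weighted calculation described above could look roughly like the following Python sketch. It assumes each charge state is available as a `(charge, monoisotopic_mz, intensity)` tuple and that ions are protonated. The tuple layout and function names are hypothetical and do not reflect the PS or mzLib API.

```python
PROTON_MASS = 1.007276466  # Da, mass of a proton


def neutral_mass(mz, charge):
    """Convert the m/z of a protonated ion to its neutral mass."""
    return (mz - PROTON_MASS) * charge


def weighted_monoisotopic_mass(charge_states):
    """Average the neutral monoisotopic mass over charge states,
    weighting each charge state by its intensity.

    charge_states: iterable of (charge, monoisotopic_mz, intensity).
    """
    states = list(charge_states)
    total_intensity = sum(intensity for _, _, intensity in states)
    if total_intensity <= 0:
        raise ValueError("total intensity must be positive")
    weighted_sum = sum(
        neutral_mass(mz, charge) * intensity
        for charge, mz, intensity in states
    )
    return weighted_sum / total_intensity
```

This is why per-charge-state intensity needs to be exposed: without it, only an unweighted average of the charge states is possible.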
Rob has found that the min peak in the y array is a good approximation of the noise.
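A minimal Python sketch of that approximation, assuming the spectrum is available as parallel m/z and intensity lists. The noise is taken to be the smallest positive intensity in the spectrum, and the function names are made up for illustration.

```python
def estimate_noise(intensities):
    """Approximate the noise level as the smallest positive intensity."""
    positive = [y for y in intensities if y > 0]
    if not positive:
        raise ValueError("spectrum has no positive intensities")
    return min(positive)


def filter_by_sn(mzs, intensities, min_sn):
    """Keep only the peaks whose estimated S/N is at least min_sn."""
    noise = estimate_noise(intensities)
    return [
        (mz, y)
        for mz, y in zip(mzs, intensities)
        if y / noise >= min_sn
    ]
```

A min S/N setting would then simply be the `min_sn` threshold applied before deconvolution.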
Is it deconvoluting a given mass across the entire RT range of the whole run? I need to know where each proteoform is most abundant in RT space (the apex RT as stated above) because then we only aggregate/make EE comparisons with experimentals nearby in RT...
What is a good starting point for intensity ratio?
Ah, you're right, the peak objects are hidden in private fields. I will make those public to expose all the information.
Yes, it is deconvoluting across the entire RT range. Once you get all the peak info, you could extract the most abundant time.
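One way to get the apex RT from the exposed peak info is sketched below in Python. It assumes each peak assigned to a proteoform can be reduced to a `(retention_time, intensity)` pair; that representation is an assumption, not the shape of the mzLib output.

```python
from collections import defaultdict


def apex_rt(peaks):
    """Return the retention time at which the summed intensity of a
    proteoform's peaks is highest.

    peaks: iterable of (retention_time, intensity) pairs, with one entry
    per peak; several peaks may share the same retention time (scan).
    """
    intensity_by_rt = defaultdict(float)
    for rt, intensity in peaks:
        intensity_by_rt[rt] += intensity
    if not intensity_by_rt:
        raise ValueError("no peaks given")
    return max(intensity_by_rt, key=intensity_by_rt.get)
```

Summing per scan before taking the maximum means a scan with many moderately intense charge states can win over a scan with one intense peak.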
5 worked well for me for intensity ratio
All issues fixed here:
I'd like to add my 2c and say that deconvolution probably shouldn't assume the monoisotopic mass is below the limit of detection if the theoretical monoisotopic intensity is above noise.
This is a complicated way of saying: "Is the monoisotopic peak observed? and if not, should it be observed given the intensity of the isotope envelope and the noise level?"
Sorry, I don't understand
let's discuss when you're in
The way I understand this:
We would like to explore whether imputed peaks are sometimes above the expected noise level, and thus should have been present in the spectrum in the first place. In this case the detected match may be wrong.
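A rough Python sketch of that check, under these assumptions: the theoretical envelope gives relative intensities for the monoisotopic and the most intense isotope peak, the observed envelope is scaled at its most intense peak, and a noise estimate is available. All names and the scaling choice are illustrative only.

```python
def expected_monoisotopic_intensity(observed_apex_intensity,
                                    theo_mono_fraction,
                                    theo_apex_fraction):
    """Scale the theoretical monoisotopic intensity to the observed envelope."""
    if theo_apex_fraction <= 0:
        raise ValueError("theoretical apex fraction must be positive")
    return observed_apex_intensity * theo_mono_fraction / theo_apex_fraction


def monoisotopic_should_be_observed(observed_apex_intensity,
                                    theo_mono_fraction,
                                    theo_apex_fraction,
                                    noise,
                                    min_sn=1.0):
    """True if the imputed monoisotopic peak is expected to rise above noise."""
    expected = expected_monoisotopic_intensity(
        observed_apex_intensity, theo_mono_fraction, theo_apex_fraction
    )
    return expected >= min_sn * noise
```

If this returns True but no monoisotopic peak was found in the spectrum, the match is suspicious and could be penalized or rejected.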
Part of the score could be how Gaussian the elution profile looks.
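As one possible way to quantify this, the Python sketch below builds a Gaussian from the intensity-weighted mean and variance of the elution profile and reports the Pearson correlation between the observed profile and that Gaussian. Both the moment-matching step and the use of correlation as the score are assumptions chosen for simplicity, not an existing mzLib feature.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x <= 0 or var_y <= 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def gaussian_profile_score(rts, intensities):
    """Correlate an elution profile with a moment-matched Gaussian.

    Values near 1 indicate a Gaussian-looking profile.
    """
    total = sum(intensities)
    if len(rts) != len(intensities) or len(rts) < 3 or total <= 0:
        raise ValueError("need at least 3 points and positive total intensity")
    # Intensity-weighted mean and variance of the retention times.
    mean = sum(t * i for t, i in zip(rts, intensities)) / total
    var = sum(i * (t - mean) ** 2 for t, i in zip(rts, intensities)) / total
    if var <= 0:
        return 0.0
    model = [math.exp(-((t - mean) ** 2) / (2 * var)) for t in rts]
    return pearson(intensities, model)
```

A full least-squares fit would be more rigorous, but the moment-matched version needs no optimizer and is cheap enough to run on every feature.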
Another thought: the "intensity ratio" parameter, which decides how stringent we are with the requirement that intensities match averagine, should be computed automatically/dynamically based on things like TIC, mass, the ratio of isotope envelope intensity to TIC, or something else, instead of just being hard-coded.