brainpy
brainpy copied to clipboard
Difference in about 7.5 m/z with predicted and experimental spectra.
I'm not a mass spec person, but I've recently had to analyza a compound with formula that has multiple copper and sulphur atoms, and brainpy gives me pretty large deviations from the experimental spectra; in particular, I get a ~ 7.5 m/z deviation for all peaks in the spectrum. These deviations are not observed in other spectra MS prediction software, such as the online applet https://www.sisweb.com/mstools/isotope.htm In particular, for a molecular formula H60C46Cu3Se1N3 calling brainpy with:
from brainpy import isotopic_variants
m = {'H': 60, 'C': 46, 'Cu':3, 'Se': 1, 'N':3}
theoretical_isotopic_cluster = isotopic_variants(m, npeaks=10, charge=0)
for peak in theoretical_isotopic_cluster:
print(peak.mz, peak.intensity)
Produces as a result
923.1840377386 0.0022227601516974682
924.1872973029356 0.0011455754382822827
925.3231309664317 0.02666552921958156
926.3839931791374 0.03269740350959813
927.4379433948883 0.10527109615812123
928.4370417263187 0.07597124620984472
929.5678061770984 0.24311678266903156
930.5656801824274 0.1288335183195178
931.637296795344 0.2601097788982547
932.6444988578364 0.12396630942607062
It seems to me that brainpy is predicting 931.64 to be the m/z with the maximum intensity, but https://www.sisweb.com/mstools/isotope.htm using high resolution, H60C46Cu3Se1N3 and minimum abundance of 0.01% predicts
924.18108 0.81
924.18381 6.83
924.18602 0.08
924.18637 23.57
924.18647 4.02
924.1874 37.21
924.18794 0.06
924.18857 0.05
(note that these are only some example peaks which contain the max value) which means 924.1874 is the m/z with maximum intensity, and these results agree 100% with the experimental spectrum, which makes me suspect a bug in brainpy.
Based upon the scale of the m/z deltas your reporting, what you're seeing is isotopic fine structure, where the delta m/z between peaks is the difference between the masses of neutrons of different elements rather than differences between number of extra neutrons.
brainpy implements the BRAIN algorithm which collapses fine structure, as described in “BRAIN: a universal tool for high-throughput calculations of the isotopic distribution for mass spectrometry.,” Anal. Chem., vol. 85, no. 4, pp. 1991–4, Feb. 2013.
If you want a tool to generate fine structure, check out https://github.com/MatteoLacki/IsoSpec which has Python bindings at IsoSpecPy already available on the Python package index.
Isotopic fine structure is only visible above a certain resolution, and will depend upon your instrument. Do you know what the settings on your instrument (detector type, resolution) and whether you see larger isotopologues or isotopomers?
Thank you for the prompt response!
I'm not exactly sure what you mean by "difference in masses of neutrons", as I said I'm not knowledgable about this topic, but my guess is that the deviation I'm seeing, which is about ~ 7.5 m/z is a bit too large. I don't really care about anything after the first 3-4 sig figs (i.e. I care if the number is 924 m/z or 931 m/z but nothing else) I don't really need that level of accuracy, my issue is that brainpy predicts about ~ 931 m/z as the largest peak, but even a naive calculation of the m/z by using the masses of the most abundant isotopes of each element gives a value between 922-924 m/z in this case.
I knew BRAIN implemented an approximation but I didn't know it would be possible for it to fail to this extent (~ 7.5 m/z or 1% of the m/z value), but maybe I'm misunderstanding the use case of this algorithm, I'm guessing this makes much more sense being designed for much larger biomolecules / proteins, and not so much for small organic molecules such as the one I'm trying to predict.
I used the package since the example shown in the brainpy's README plots a MS spectra of a relatively small organic molecule at approximately this scale, (800 - 805 m/z) so I recommend some sort of disclaimer is added to the documentation if these differences are to be expected.
Thanks a lot for the IsoSpec recommendation! I will definitely check that out, and thank you again for helping with this issue.
Also, I wanted to add that the difference of m/z between peaks is pretty accurate, the issue is that all peaks seem to be displaced by ~ 7 m/z, which lead my further to believe that this is some sort of bug that is displacing values by ~ 7 in this case, instead of an artifact of some approximation in the algorithm, which I would guess would probably give very small random deviations for each peak, but again, maybe this is just a consequence of this "fine structure collapse" that I'm not fully grasping.
I attach for comparison the values predicted by the applet using a low resolution spectra, as you can see even in this case the maximum peak is ~ 925 m/z, pretty far from brainpy's prediction of 931
917 0.9
918 0.4
919 9.9
920 12.3
921 39.6
922 28.7
923 92.5
924 49.2
925 100
926 47.9
927 54.3
928 23.8
929 15.3
930 5.8
931 2
932 0.6
933 0.1
Thank you for reporting the issue, and then clarifying it when I dismissed it in haste.
I was paying too much attention to that second peak list thinking you were talking about distance between peaks. Now I understand it's that the theoretical most abundant peak is wrong. I think the issue is related to Selenium, not an element I've dealt with at this level of fundamentals.
Visualizing the pattern for the composition you're using:

Selenium's isotopes are weird:
In [11]: Se = brainpy.periodic_table['Se']
In [12]: Se.isotopes
Out[12]:
OrderedDict([(-6,
Isotope(abundance=0.0089, neutrons=74, mass=73.9224764, neutron_shift=-6)),
(-4,
Isotope(abundance=0.0937, neutrons=76, mass=75.9192136, neutron_shift=-4)),
(-3,
Isotope(abundance=0.0763, neutrons=77, mass=76.919914, neutron_shift=-3)),
(-2,
Isotope(abundance=0.2377, neutrons=78, mass=77.9173091, neutron_shift=-2)),
(0,
Isotope(abundance=0.4961, neutrons=80, mass=79.9165213, neutron_shift=0)),
(2,
Isotope(abundance=0.0873, neutrons=82, mass=81.9166994, neutron_shift=2))])
If I try to simplify the problem, I can isolate it:
In [33]: brainpy.isotopic_variants({"Se": 1}, npeaks=10)
Out[33]:
[Peak(mz=79.916521, intensity=0.086745, charge=0),
Peak(mz=82.075165, intensity=0.913255, charge=0)]
Note how the minimal mass peak (the "monoisotopic" peak) has a mass equal to the true monoisotopic mass for Se[78], but has the abundance of Se[74], while
I suspect that there is an error in the logic somewhere under https://github.com/mobiusklein/brainpy/blob/master/brainpy/_c/isotopic_constants.pyx#L274, probably in https://github.com/mobiusklein/brainpy/blob/master/brainpy/_c/isotopic_constants.pyx#L102 where it assumes that the lowest isotope is 0 or at least non-negative, which is not the case for Se.
Unfortunately, I have a pretty full log of work right now. If IsoSpec can do the job for you, I'd rather you use that than wait while I try to fix this now, as it'd take me a while just to get to this.
Ahh, I see what the issue may be! I think you narrowed it down pretty well. IsoSpec does the job for me right now, so I've switched to that, I'm also pretty busy but in the future if you can't find time to fix this I may do a PR, since I like the documentation/API of brainpy a bit more.
Hi both,
I know I'm about 2 years late on this, but wanted to say I really appreciate this package and that I think I have a resolution to this. I seemed to notice this behavior with iron (in heme) and found that there were a couple things to resolve (in the pure pythonic):
- Order was being set with max_variants, which assumed a base mass of >= 0 neutron shifts, so you can adjust brainpy's L370 to subtract the min_neutron_shift (adding everything - and onwards):
max_n_variants += count * (periodic_table[element].max_neutron_shift() - periodic_table[element].min_neutron_shift()) - The center_mass was iterating from 0 ascendingly, so the quick fix was to ascend from the min neutron shift instead. You can set something like a low_mass parameter in center_mass equal to the min_neutron_shift, then update the center parameter to have the final term be relative to that lowest mass (replacing L591 with).
low_mass = self._isotopic_constants[element].element.min_neutron_shift() center += self.composition[element] * sign * ele_sym_poly[i] *\ self.monoisotopic_peak.intensity * self._isotopic_constants[element].element.isotopes[low_mass].mass
This should allow you to undo the comment for Fe if I'm not mistaken.
Anyway, thanks again for sharing this, hope this captures your intent.
Thank you for the suggested solution. Is this the region you are referring to modifying for the second point?
https://github.com/mobiusklein/brainpy/blob/master/src/brainpy/brainpy.py#L576-L583
for element, ele_sym_poly in ele_sym_poly_map.items():
center += self.composition[element] * sign * ele_sym_poly[i] *\
self.monoisotopic_peak.intensity * self._isotopic_constants[element].element.monoisotopic_mass()
Yes! That's correct. Apologies for the delay and for not linking the code directly.