
Profile spectra in MassBank

Open kashout opened this issue 2 months ago • 0 comments

Hi all,

I noticed that some MS² spectra in the library appear to contain profile-mode data rather than centroided peak lists. This becomes apparent when zooming in on one of the major peaks. For example:

These spectra show multiple consecutive m/z points forming peak shapes instead of centroided peaks. I assume centroided spectra are the intended format for MassBank submissions.
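For anyone who wants to run a similar check, here is a minimal sketch of one possible heuristic. It is not necessarily the exact screening behind the numbers in this issue. It flags a spectrum when several consecutive m/z points sit closer together than a small gap. The function names and the `max_gap` / `min_run` thresholds are assumptions for illustration and would need tuning per instrument (unit-resolution instruments sample more coarsely than TOF or Orbitrap instruments).

```python
from typing import Sequence


def longest_dense_run(mz: Sequence[float], max_gap: float = 0.05) -> int:
    """Return the length of the longest run of consecutive m/z points
    whose neighbours are spaced closer than max_gap (in m/z units)."""
    values = sorted(mz)
    if not values:
        return 0
    longest = current = 1
    for prev, cur in zip(values, values[1:]):
        if cur - prev < max_gap:
            # Still inside a densely sampled region: extend the run.
            current += 1
            longest = max(longest, current)
        else:
            # Gap is wide enough to start a new run.
            current = 1
    return longest


def looks_like_profile(mz: Sequence[float],
                       max_gap: float = 0.05,
                       min_run: int = 5) -> bool:
    """Flag a spectrum as profile-like if it contains at least min_run
    closely spaced consecutive points (both thresholds are assumptions)."""
    return longest_dense_run(mz, max_gap) >= min_run


# Toy spectra, made up for illustration only.
centroid_mz = [100.05, 101.05, 150.10, 200.12]
profile_mz = [100.00, 100.01, 100.02, 100.03, 100.04, 100.05, 150.00]

print(looks_like_profile(centroid_mz))
print(looks_like_profile(profile_mz))
```

A spacing-only check like this can misfire on dense centroided spectra, so some false positives and false negatives are to be expected.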

I screened all MassBank spectra for profile-like characteristics and identified contributors where a large fraction of spectra appear to be profile data:

| Contributor | Instrument / Setup | Likely Profile Spectra | Total Spectra | Fraction |
| --- | --- | --- | --- | --- |
| NAIST | LTQ Orbitrap XL (Thermo) | 2 | 621 | ~0.3% |
| RIKEN | Waters Xevo G2 Q-Tof | 2 | 10,217 | ~0.02% |
| RIKEN_IMS | AB Sciex TripleTOF 5600+ (DuoSpray) | 6 | 753 | ~0.8% |
| RIKEN_IMS | TripleTOF 5600 (Peptide BEH C18) | 384 | 386 | ~100% |
| RIKEN_NPDepo | ABSciex API3200 LC/MS | 327 | 332 | ~98% |
| RIKEN_NPDepo | Agilent 6410 QQQ | 1623 | 1624 | ~100% |
| mFam | Agilent 6530 Q-TOF LC/MS | 38 | 764 | ~5% |
| mFam | Agilent 6560 IM-QTOF | 5 | 28 | ~18% |

I manually verified the first three rows in the table because of their low fractions, and they are indeed false positives (the spectra are centroided). I think it would be good to curate the bold entries in the table, along with other spectra from the same submission batches, as there are probably also some false negatives. That is, assuming profile spectra should indeed not be present in the library.
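To decide which batches deserve a closer look, the per-setup fraction can be tallied from per-spectrum flags. Below is a rough sketch. The function names, the batch labels, and the `min_fraction` cutoff are hypothetical choices for illustration; only the counts in the toy data are taken from the table.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def profile_fractions(
    records: Iterable[Tuple[str, bool]]
) -> Dict[str, Tuple[int, int, float]]:
    """Map each batch key to (n_profile, n_total, fraction).

    records yields (batch_key, is_profile) pairs, one per spectrum.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [n_profile, n_total]
    for key, is_profile in records:
        counts[key][1] += 1
        if is_profile:
            counts[key][0] += 1
    return {k: (p, t, p / t) for k, (p, t) in counts.items()}


def batches_to_curate(
    records: Iterable[Tuple[str, bool]], min_fraction: float = 0.5
) -> List[str]:
    """Return batch keys whose profile-like fraction reaches min_fraction.

    The 0.5 default is an arbitrary assumption, not a MassBank rule.
    """
    fractions = profile_fractions(records)
    return sorted(k for k, (_, _, f) in fractions.items() if f >= min_fraction)


# Toy input rebuilt from two table rows (327 of 332 and 2 of 621).
records = (
    [("RIKEN_NPDepo / API3200", True)] * 327
    + [("RIKEN_NPDepo / API3200", False)] * 5
    + [("NAIST / LTQ Orbitrap XL", True)] * 2
    + [("NAIST / LTQ Orbitrap XL", False)] * 619
)

print(batches_to_curate(records))
```

Working at the batch level rather than per spectrum means unflagged spectra in a mostly-profile batch still get reviewed, which helps with the false negatives mentioned above.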

A full list of these likely profile spectra, including false positives, can be found here: potential_profile_spectra.csv

Cheers, Kas

kashout, Nov 08 '25 12:11