CompoundDb

Default spectra variable names

Open • jorainer opened this issue 5 years ago • 8 comments

As suggested by @michaelwitting in issue #61, we should agree on a base nomenclature for compound/spectra identifiers. I am generally not a big fan of camelCase in variable names (just too easy to mistype), so I'd suggest using all lower case.

Happy to get feedback, change requests and additions to the list @michaelwitting @stanstrup @sneumann

  • InChI: inchi
  • InChIKey: inchikey
  • SMILES: smiles
  • SPLASH: splash ...

jorainer, Sep 17 '20 13:09

Unsure what kind of things go here exactly.

  • cas
  • local_identifier
  • mslevel
  • manufacturer
  • ionmode
  • precursormz

stanstrup, Sep 17 '20 13:09

  • Formula: formula
  • Adduct: adduct

michaelwitting, Sep 17 '20 13:09

Once we have agreed on a format, I would like to refactor this in MsBackendMassbank, so that the spectra can be used directly without any reformatting.

michaelwitting, Sep 17 '20 13:09

I would suggest always using the default names for the core spectra variables (see the list of variables in the General description of the Spectra vignette). Thus I would use:

  • msLevel instead of mslevel.
  • precursorMz instead of precursormz.
  • polarity instead of ionmode (if that is what you mean, @stanstrup)

(note my lack of consistency here - sorry for that :( )
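The renaming suggested above could be captured in a small lookup table. The following is only a minimal sketch in Python (not the actual CompoundDb or MsBackendMassbank code): the names `CORE_VARIABLE_NAMES` and `rename_core_variables` are hypothetical, and a spectrum is assumed to be a plain dictionary of variable name to value.

```python
# Hypothetical mapping from the lower-case names proposed earlier in this
# thread to the core spectra variable names suggested above.
CORE_VARIABLE_NAMES = {
    "mslevel": "msLevel",
    "precursormz": "precursorMz",
    "ionmode": "polarity",
}


def rename_core_variables(record):
    """Return a copy of `record` with known keys renamed to the core names.

    Keys without an entry in CORE_VARIABLE_NAMES (e.g. "inchikey") are kept
    unchanged. Only the names are changed, never the values.
    """
    return {CORE_VARIABLE_NAMES.get(key, key): value for key, value in record.items()}


# Made-up record, for illustration only.
example = {"mslevel": 2, "precursormz": 100.5, "ionmode": "positive", "inchikey": None}
renamed = rename_core_variables(example)
```

Note that this only renames the variables. Whether the values of ionmode would also need converting to match polarity is a separate question that the sketch does not address.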

jorainer, Sep 17 '20 14:09

To summarize so far, we have the core spectra variables:

  • acquisitionNum
  • centroided
  • collisionEnergy
  • dataOrigin
  • dataStorage
  • intensity
  • isolationWindowLowerMz
  • isolationWindowTargetMz
  • isolationWindowUpperMz
  • msLevel
  • mz
  • polarity
  • precScanNum
  • precursorCharge
  • precursorIntensity
  • precursorMz
  • rtime
  • scanIndex
  • smoothed

And our "library" variables (plus a few more suggestions from my side):

  • name (required)
  • exactmass (required)
  • adduct (required)
  • formula (required)
  • splash (required)
  • inchi (optional)
  • inchikey (optional)
  • smiles (optional)
  • cas (optional)
  • localidentifier (optional)

In the case of lipids we often don't have the exact structure, so I would like inchi, inchikey and smiles to be optional.
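To make the required/optional split concrete, here is a minimal sketch in Python of how a library entry could be checked against the list above. The function name `missing_required` and the dictionary representation of an entry are assumptions made for illustration, not part of CompoundDb, and the example values are made up.

```python
# Variable names and their required/optional status as listed above.
REQUIRED_LIBRARY_VARIABLES = ("name", "exactmass", "adduct", "formula", "splash")
OPTIONAL_LIBRARY_VARIABLES = ("inchi", "inchikey", "smiles", "cas", "localidentifier")


def missing_required(record):
    """Return the required library variables that are absent or None in `record`."""
    return [var for var in REQUIRED_LIBRARY_VARIABLES if record.get(var) is None]


# Illustrative lipid-like entry without structure identifiers
# (placeholder values, not taken from any database).
lipid = {
    "name": "example lipid",
    "exactmass": 700.5,
    "adduct": "[M+H]+",
    "formula": "placeholder-formula",
    "splash": "placeholder-splash",
}

# Entry that lacks most of the required variables.
incomplete = {"name": "example lipid", "smiles": None}
```

With this split, `missing_required(lipid)` returns an empty list even though inchi, inchikey and smiles are absent, while `missing_required(incomplete)` reports the four missing required variables.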

michaelwitting, Sep 24 '20 07:09

OK, this means I should update/change the code in CompoundDb to use these new column names, right?

jorainer, Sep 25 '20 12:09

I would guess so... if there are no objections from @stanstrup?

michaelwitting, Sep 25 '20 12:09

No objections

stanstrup, Sep 25 '20 14:09