
Type guessing should attempt to go via elements first

Open IAlibay opened this issue 1 year ago • 3 comments

Current behaviour

The current behaviour, and something we reinforce in #3753, is that type guessing works by attempting to guess elements from atom names.

Proposed behaviour

The proposal here is to always try to read from elements first rather than guessing through names. If elements exist AND they are complete, then you return those rather than guessing them.
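As a rough illustration of the proposed order of operations, here is a minimal sketch in plain Python. It is not the MDAnalysis guesser API: `guess_types` and `guess_element_from_name` are hypothetical names, and the name-based fallback is a deliberately crude stand-in (first letter of the name) rather than the real guessing logic.

```python
from typing import List, Optional, Sequence


def guess_element_from_name(name: str) -> str:
    """Hypothetical stand-in for name-based guessing.

    Keeps only the first alphabetic character of the atom name, upper-cased.
    The real guesser is more involved; this is only here for the fallback path.
    """
    letters = "".join(ch for ch in name if ch.isalpha())
    return letters[:1].upper()


def guess_types(
    names: Sequence[str],
    elements: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return atom types, preferring existing elements over name guessing."""
    # Elements exist AND are complete: one non-empty entry per atom.
    if (
        elements is not None
        and len(elements) == len(names)
        and all(elements)
    ):
        return list(elements)
    # Otherwise fall back to guessing from the atom names.
    return [guess_element_from_name(name) for name in names]
```

With complete elements the names are never inspected, so `guess_types(["CA", "N"], ["Ca", "N"])` would simply hand back the elements; with missing or partial elements the function takes the name-based path instead.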

Where would this matter?

A good example here is FHIAIMS where:

names == elements

types == guessed elements from names

In this case it would have been "safer" (i.e. fewer code paths exercised) to simply do names == elements == types.

Target release

This needs discussion. From my own limited look at things, there aren't any cases where making this behaviour change would negatively impact behaviour. Indeed, I'm not sure I can see any cases where behaviour would change.

IAlibay avatar Aug 31 '24 09:08 IAlibay

cc @lilyminium

IAlibay avatar Aug 31 '24 09:08 IAlibay

Agreed, I think this would make a lot more sense! It also avoids potential weirdness in cases like the RDKitParser, where type guessing can end up trying to guess the element from atom names that can variously be MonomerInfo names or _TriposAtomNames. However, this probably needs @MDAnalysis/coredevs consensus.

lilyminium avatar Sep 02 '24 02:09 lilyminium

Hi there, I was just wondering, would this proposed change address the issue where, as is stated in the MDAnalysis documentation, 'atoms named “CA” are much more likely to represent an alpha-carbon than a calcium atom'? Because I'm trying to analyse a system containing argon (Ar) and MDAnalysis is coming up with an error stating that the atom mass cannot be guessed.

isolated-matrix avatar Oct 05 '24 09:10 isolated-matrix