fix: support alternative atom names within connect_via_residue_names
Addresses https://github.com/biotite-dev/biotite/issues/684
@Croydon-Brixton welcome your thoughts, as I know you've thought a lot about this as well
I found a problem with residues that mix naming standards: for about 50% of all residues in the CCD, the standard and alternative names conflict, i.e. the same name is used for different atoms. For example, ZZU swaps O and OXT. Always mapping from alt to std names would therefore misidentify std names as alt names, so some bonds would not be found even if a residue uses std nomenclature.
Hence, I would use your current approach: only map the atoms of a residue if any of its atom names is not in the std list of atoms.
Sounds like a plan. I went with the behavior of only trying alternative atom names when we cannot map all atoms to standard names, partly for efficiency as well: such a mapping would be needed only in very rare instances (and typically only for non-canonicals/ligands).
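For reference, here is a minimal sketch of that fallback as I understand it. The atom names, the `ALT_TO_STD` table, and the `to_std_names` helper are made up for illustration; they are not actual CCD contents or biotite API. It only shows why unconditional alt-to-std mapping breaks a residue that already uses std names, and how the "map only if some name is not std" rule avoids that.

```python
# Hypothetical toy residue: NOT real CCD data and NOT biotite API.
STD_ATOM_NAMES = {"N", "CA", "C", "O", "OXT"}

# Hypothetical alt -> std table in which "O" and "OXT" are swapped,
# so both names are valid in either nomenclature but mean different atoms.
ALT_TO_STD = {"N1": "N", "CA": "CA", "C": "C", "OXT": "O", "O": "OXT"}


def to_std_names(atom_names, std_names, alt_to_std):
    """Return the atom names in std nomenclature, mapping only when needed."""
    if all(name in std_names for name in atom_names):
        # Every name is a valid std name: assume the residue already uses
        # std nomenclature and leave it untouched.
        return list(atom_names)
    # At least one name is not a std name: treat the whole residue as
    # using alt nomenclature and map every atom.
    return [alt_to_std.get(name, name) for name in atom_names]


std_residue = ["N", "CA", "C", "O", "OXT"]
alt_residue = ["N1", "CA", "C", "O", "OXT"]

# Unconditional mapping wrongly swaps O and OXT in the std residue.
always_mapped = [ALT_TO_STD.get(name, name) for name in std_residue]

# Conditional mapping keeps the std residue as-is and fixes the alt one.
kept = to_std_names(std_residue, STD_ATOM_NAMES, ALT_TO_STD)
mapped = to_std_names(alt_residue, STD_ATOM_NAMES, ALT_TO_STD)
```

One limitation of this rule, at least in the sketch: a residue that uses alt nomenclature but whose names all happen to be valid std names would not be detected.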
Updated to address the comments. Given that we only look up alternative atom names in the rare event that we can't match all of the atoms to standard names, I thought we didn't need a separate get_std_to_alt_atom_name_map function within atoms; happy to add it if you think that would be cleaner!
Thank you for the feedback, and apologies for the delay — I will update and re-submit this weekend!
Closing, since the latest updates of the PDB no longer contain outdated atom names.