pymatgen icon indicating copy to clipboard operation
pymatgen copied to clipboard

SpaceGroup object init and string representation issues for space group types with screw axes in non-standard settings

Open kaueltzen opened this issue 1 year ago • 3 comments

Python version

3.10

Pymatgen version

2024.5.1 or master

Operating system version

No response

Current behavior

This issue was noticed when working on #3859 and #3861 .

In SpaceGroup.SYMM_OPS the translational information in screw axes is appended without an underscore, e.g. "Pmc21" while in SpaceGroup.sg_encoding (comprising the standard settings and a bit more, it seems?) this information is represented with an underscore. If one initializes a SpaceGroup object as in https://github.com/materialsproject/pymatgen/blob/163cc01e0251525f5ab8f48776511411754012a9/tests/symmetry/test_groups.py#L211 with an underscore, no match in SpaceGroup.SYMM_OPS is found and the attributes are set starting from L241 based on SpaceGroup.sg_encoding and the int_symbol parameter: https://github.com/materialsproject/pymatgen/blob/163cc01e0251525f5ab8f48776511411754012a9/pymatgen/symmetry/groups.py#L241


The problem arises if one wants to initialize SpaceGroup objects with screw axes in non-standard settings. An example is P2_1ma, a non-standard representation of Pmc2_1. But while this does not give an error

sg = SpaceGroup("Pmc2_1")
assert sg.to_unicode_string() == "Pmc2₁"

the non-standard setting raises the ValueError from https://github.com/materialsproject/pymatgen/blob/163cc01e0251525f5ab8f48776511411754012a9/pymatgen/symmetry/groups.py#L239

sg = SpaceGroup("P2_1ma")
assert sg.to_unicode_string() == "P2₁ma"

If one leaves out the underscore, the SpaceGroup object can be instantiated, but it will generate a wrong unicode string:

sg = SpaceGroup("P21ma")
assert sg.to_unicode_string() == "P2₁ma"

AssertionError
Expected :'P2₁ma'
Actual   :'P21ma'

Possible solutions:

  1. A quick fix would be to not remove whitespaces in the Hermann-Mauguin symbol when loading SYMM_OPS: https://github.com/materialsproject/pymatgen/blob/163cc01e0251525f5ab8f48776511411754012a9/pymatgen/symmetry/groups.py#L185 It might be even best if we either added underscores to screw axes for entries in SYMM_OPS (would prefer) or remove them for entries in sg_encodings. Right now, the same space group type in the same setting will give different symbols depending on whether one follows the LaTeX advice in the documentation of init or not (not following it does not give an error, but initializes from SYMM_OPS instead
sg = SpaceGroup("P4_1")
print(sg.symbol)
sg = SpaceGroup("P41")
print(sg.symbol)

P4_1
P41
  1. Further solution: In my opinion, a single database with space group type information would be best to avoid bugs like this. However, this would require a lot more work (rewriting the init but also creating this database) and careful testing. Thoughts?

Expected Behavior

A successful SpaceGroup initialization and correct string representation for space group types with screw axes in non-standard settings.

Minimal example

No response

Relevant files to reproduce this bug

No response

kaueltzen avatar Jun 05 '24 11:06 kaueltzen

wow, love your detailed issue reports with discovery context and repro steps. keep that up! 👍

regarding the issue, your proposed fix sounds good to me. i hope others with more experience weigh in though

fix would be to not remove whitespaces in the Hermann-Mauguin symbol when loading SYMM_OPS

janosh avatar Jun 05 '24 13:06 janosh

wow, love your detailed issue reports with discovery context and repro steps.

Seconded, much appreciated!

In my opinion, a single database with space group type information would be best to avoid bugs like this.

Yes, I agree. I'm not sure if there should even be a HermannMauguinSymbol class that can handle parsing logic, convert to/from full, short and extended symbols, etc. Perhaps we're asking str to do too much work here.

mkhorton avatar Jun 09 '24 03:06 mkhorton

Yes, I agree. I'm not sure if there should even be a HermannMauguinSymbol class that can handle parsing logic, convert to/from full, short and extended symbols, etc. Perhaps we're asking str to do too much work here.

I like the idea of an additional HermannMauguinSymbol class for these tasks :+1: However, I think I would not have the required time to implement that in the near future. For now, this issue is solved with backwards compatibility by an additional "hermann_mauguin_u" key in the symm_ops.json database (this is documented in dev_scripts/update_spacegroup_data.py). This converts the symm_ops.json "hermann_mauguin" key (and later in groups.py in some cases user input) to the underscore notation of symm_data.json.

kaueltzen avatar Jun 10 '24 10:06 kaueltzen