Symmetry operations (pdbx_struct_oper_list) in mmCIF
It looks like we should use the symmetry operations defined in pdbx_struct_oper_list when reading the structure. Otherwise, we would be missing a subset of atoms in the full structure.
@sguionni said he had a sample file containing this attribute. It would be great if you could share it here!
It looks like most files containing this tag contain only the "identity" transformation, e.g. 4HHB https://github.com/chemfiles/tests-data/blob/0b46e742dedf5cdf6feb3b7376745330b544e26d/cif/4hhb.cif#L6616-L6631
Here is a possible example of a file containing this tag with more than one symmetry operation: https://www.rcsb.org/structure/3T6C.
One thing to keep in mind for the implementation is that this tag can be defined inline (one tag and value per line), as in 4HHB, or in a `loop_` block, as in 3T6C.
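To make the inline vs. `loop_` distinction concrete, here is a minimal Python sketch of a reader that accepts both layouts. It is only an illustration, not chemfiles code: `read_category` and `to_operator` are hypothetical names, the sample data is made up (it is not copied from 4HHB or 3T6C), and the parser assumes one loop row per line and no multi-line values. The tag names (`id`, `type`, `matrix[i][j]`, `vector[i]`) are the ones used by the `pdbx_struct_oper_list` category.

```python
import shlex

CATEGORY = "pdbx_struct_oper_list"

def read_category(text, category=CATEGORY):
    """Return the rows of one mmCIF category as a list of dicts.

    Simplified: assumes one loop row per line and values on the same line as
    their tag; multi-line (semicolon-delimited) values are not handled.
    """
    prefix = f"_{category}."
    keys, rows, inline = [], [], {}
    state = "scan"  # "scan", "loop_header" or "loop_body"
    for raw in text.splitlines():
        line = raw.strip()
        if state == "loop_body":
            if line and not line.startswith(("#", "_", "loop_", "data_")):
                rows.append(dict(zip(keys, shlex.split(line))))
                continue
            state = "scan"  # the loop ended; handle this line normally
        if line == "loop_":
            state, keys = "loop_header", []
        elif line.startswith(prefix):
            parts = shlex.split(line)
            name = parts[0][len(prefix):]
            if state == "loop_header":
                keys.append(name)        # looped form: collect column names
            elif len(parts) > 1:
                inline[name] = parts[1]  # inline form: tag followed by value
        elif state == "loop_header":
            if keys and line and not line.startswith("_"):
                rows.append(dict(zip(keys, shlex.split(line))))
                state = "loop_body"      # first data row of our loop
            elif not line.startswith("_"):
                state = "scan"           # a loop for some other category
    return rows or ([inline] if inline else [])

def to_operator(row):
    """Convert one row into (id, 3x3 rotation matrix, translation vector)."""
    matrix = [[float(row[f"matrix[{i}][{j}]"]) for j in (1, 2, 3)]
              for i in (1, 2, 3)]
    vector = [float(row[f"vector[{i}]"]) for i in (1, 2, 3)]
    return row["id"], matrix, vector

# Made-up sample data illustrating the two layouts.
INLINE_EXAMPLE = """
_pdbx_struct_oper_list.id             1
_pdbx_struct_oper_list.type           'identity operation'
_pdbx_struct_oper_list.matrix[1][1]   1.0
_pdbx_struct_oper_list.matrix[1][2]   0.0
_pdbx_struct_oper_list.matrix[1][3]   0.0
_pdbx_struct_oper_list.vector[1]      0.0
_pdbx_struct_oper_list.matrix[2][1]   0.0
_pdbx_struct_oper_list.matrix[2][2]   1.0
_pdbx_struct_oper_list.matrix[2][3]   0.0
_pdbx_struct_oper_list.vector[2]      0.0
_pdbx_struct_oper_list.matrix[3][1]   0.0
_pdbx_struct_oper_list.matrix[3][2]   0.0
_pdbx_struct_oper_list.matrix[3][3]   1.0
_pdbx_struct_oper_list.vector[3]      0.0
#
"""

LOOP_EXAMPLE = """
loop_
_pdbx_struct_oper_list.id
_pdbx_struct_oper_list.type
_pdbx_struct_oper_list.matrix[1][1]
_pdbx_struct_oper_list.matrix[1][2]
_pdbx_struct_oper_list.matrix[1][3]
_pdbx_struct_oper_list.vector[1]
_pdbx_struct_oper_list.matrix[2][1]
_pdbx_struct_oper_list.matrix[2][2]
_pdbx_struct_oper_list.matrix[2][3]
_pdbx_struct_oper_list.vector[2]
_pdbx_struct_oper_list.matrix[3][1]
_pdbx_struct_oper_list.matrix[3][2]
_pdbx_struct_oper_list.matrix[3][3]
_pdbx_struct_oper_list.vector[3]
1 'identity operation'        1.0 0.0 0.0 0.0   0.0 1.0 0.0 0.0  0.0 0.0 1.0 0.0
2 'point symmetry operation' -1.0 0.0 0.0 10.0  0.0 -1.0 0.0 0.0 0.0 0.0 1.0 0.0
#
"""
```

Both `read_category(INLINE_EXAMPLE)` and `read_category(LOOP_EXAMPLE)` return a list of rows keyed by tag name, so the rest of the reader does not need to know which layout the file used.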
I think a good way to handle this information would be to add an "Assembly" member to the Topology, where we can register the symmetry information. The user could then use this data in two ways:
- Call a function similar to apply_symmetry in MMTF.cpp, which uses the assembly data to add all atoms to the topology with the right transformations.
- Access the assembly data directly and handle it themselves (for example, to send it straight to the GPU and avoid the CPU and RAM cost of copying atoms).
If this solution is chosen, every format with assembly data (such as MMTF) must handle assemblies in the same way: the MMTF reader, for example, would fill the Assembly data instead of directly instantiating atoms.
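As a rough illustration of what such an "Assembly" record could hold, here is a small Python sketch. The names `SymmetryOperation`, `Assembly` and `as_flat_matrices` are hypothetical and do not exist in chemfiles; the sketch only shows the second use case, where the operations are kept separate from the atoms and exported as flat 4x4 matrices (row-major), a layout that could be handed to a renderer for instancing.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SymmetryOperation:
    rotation: List[List[float]]  # 3x3 matrix, row-major
    translation: List[float]     # 3 components

@dataclass
class Assembly:
    name: str
    operations: List[SymmetryOperation] = field(default_factory=list)

    def as_flat_matrices(self) -> List[float]:
        """Flatten every operation into a 4x4 row-major affine matrix."""
        flat: List[float] = []
        for op in self.operations:
            # Each of the first three rows is [r_i1, r_i2, r_i3, t_i].
            for row, shift in zip(op.rotation, op.translation):
                flat.extend([*row, shift])
            # Last row of an affine transform.
            flat.extend([0.0, 0.0, 0.0, 1.0])
        return flat
```

With this layout the atoms are stored once, and an assembly with N operations costs only 16 × N floats until someone explicitly asks for the expanded structure.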
Storing symmetry operations separately would be an interesting solution, but it will require more work. I would rather fix this by duplicating atoms for now, and introduce an API to work with symmetry in a later release.
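For reference, the atom-duplication approach boils down to applying x' = R·x + t to every atom for every operation. The Python sketch below shows just that step, under simplifying assumptions: `apply_operation` and `expand_atoms` are hypothetical helpers, atoms are plain `(name, position)` tuples, and bonds, residues, and the selection of which chains each operation applies to are ignored.

```python
def apply_operation(position, rotation, translation):
    """Return R * position + t for a 3x3 rotation and a 3-vector translation."""
    x, y, z = position
    return tuple(
        rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z
        + translation[i]
        for i in range(3)
    )

def expand_atoms(atoms, operations):
    """Duplicate every atom once per operation.

    atoms: list of (name, (x, y, z)) tuples
    operations: list of (rotation, translation) pairs
    """
    expanded = []
    for rotation, translation in operations:
        for name, position in atoms:
            new_position = apply_operation(position, rotation, translation)
            expanded.append((name, new_position))
    return expanded
```

Including the identity operation in `operations` keeps the original atoms in the output, so a file that only lists the identity (like 4HHB) is returned unchanged.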
Here is a sample file (from YASARA): Capsid.zip
If nobody has started working on this issue, I can do it (with atom duplication only, as a first step).
I don't think anyone is working on this yet, so if you are interested feel free to have a go at it!