
Read biological assembly information where available.

Open BradyAJohnston opened this issue 3 years ago • 4 comments

Is your feature request related to a problem?

I would like to have access to the different symmetry operations required to build a biological assembly from a file if present. (https://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/biological-assemblies)

Describe the solution you'd like

A list or similar available which contains the different symmetry operations required to build the biological assembly.

Describe alternatives you've considered

I am not necessarily looking to build the assembly inside MDAnalysis, just to have access to the information.

BradyAJohnston avatar May 25 '22 06:05 BradyAJohnston

So currently if you've loaded a PDB file, then any REMARK lines are accessible at Universe.trajectory.remarks, but they aren't parsed into anything better than the raw lines from the file. From a quick skim of the PDB spec, it's not clear how universal the symmetry operations are in the given PDB examples compared to PDBx format.

richardjgowers avatar May 25 '22 07:05 richardjgowers

We could parse the BIOMT matrices if present and expose them, that should be easy enough.

IAlibay avatar May 25 '22 07:05 IAlibay
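
For illustration, here is a minimal sketch of what parsing the BIOMT records could look like. It assumes the raw REMARK 350 lines are available as strings (for example from Universe.trajectory.remarks, as mentioned above) and tolerates a missing leading "REMARK" keyword in case the reader has already stripped it. The `parse_biomt` helper is hypothetical and not part of MDAnalysis. It also ignores the BIOMOLECULE grouping and the "APPLY THE FOLLOWING TO CHAINS" lists, so operators with the same serial number in a later biomolecule block overwrite earlier ones.

```python
import re

# Matches e.g.
# "REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000"
# The leading "REMARK" is optional (assumption: a reader may have stripped it).
_BIOMT = re.compile(
    r"^\s*(?:REMARK\s+)?350\s+BIOMT([123])\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$"
)


def parse_biomt(remark_lines):
    """Return {operator_serial: (rotation, translation)}.

    rotation is a 3x3 nested list and translation a 3-element list.
    Hypothetical helper for illustration, not an MDAnalysis API.
    """
    rows = {}
    for line in remark_lines:
        match = _BIOMT.match(line)
        if match is None:
            continue  # not a BIOMT row (e.g. BIOMOLECULE or chain lines)
        row, serial = int(match.group(1)), int(match.group(2))
        values = [float(v) for v in match.group(3, 4, 5, 6)]
        rows.setdefault(serial, {})[row] = values

    operators = {}
    for serial, by_row in rows.items():
        if set(by_row) != {1, 2, 3}:
            continue  # skip operators that are missing a row
        rotation = [by_row[r][:3] for r in (1, 2, 3)]
        translation = [by_row[r][3] for r in (1, 2, 3)]
        operators[serial] = (rotation, translation)
    return operators
```

Returning plain nested lists keeps the sketch dependency-free; converting each rotation to a NumPy array would be a natural next step inside MDAnalysis.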

I'm interested in working on this issue. The solution I have in mind is to add a BIOMT topology attribute class to the topologyattr module that holds the assembly transformation matrix in a NumPy array, and to update PDBParser to parse the matrix. Am I heading in the right direction?

aya9aladdin avatar Jun 07 '22 00:06 aya9aladdin

I don't think we necessarily need a new topology attribute for this; it should be a single matrix for the whole system.

Probably the simplest thing here is to store the matrix under Timestep.data and then have a convenience method (probably under PDBParser) which allows you to use that matrix to yield the biological assembly. As @richardjgowers mentioned, the PDB standard is rather messy here, so there will be assumptions to be made (so we probably don't want to apply any kind of transformation by default).

IAlibay avatar Jun 07 '22 00:06 IAlibay
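
As a rough sketch of the opt-in convenience method discussed above, the generator below applies each stored operator to a set of coordinates as x' = R·x + t and yields one transformed copy per operator. Nothing is transformed unless the caller asks for it. The name `iter_assembly_copies` and the `{serial: (rotation, translation)}` layout are assumptions made for this example, not an existing MDAnalysis interface; in MDAnalysis the input coordinates would presumably come from something like `AtomGroup.positions`.

```python
def iter_assembly_copies(coords, operators):
    """Yield (serial, transformed_coords) for each BIOMT operator.

    coords: iterable of (x, y, z) tuples.
    operators: {serial: (rotation, translation)}, where rotation is a
        3x3 nested sequence and translation a 3-element sequence.
    Hypothetical helper for illustration only.
    """
    coords = list(coords)
    for serial in sorted(operators):
        rotation, translation = operators[serial]
        moved = [
            tuple(
                # Row i of the rotation times the point, plus the shift.
                sum(rotation[i][j] * xyz[j] for j in range(3)) + translation[i]
                for i in range(3)
            )
            for xyz in coords
        ]
        yield serial, moved


# Usage: identity plus a 180-degree rotation about z with a shift.
operators = {
    1: ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [0, 0, 0]),
    2: ([[-1, 0, 0], [0, -1, 0], [0, 0, 1]], [10, 0, 5]),
}
for serial, copy in iter_assembly_copies([(1.0, 2.0, 3.0)], operators):
    print(serial, copy)
```

Yielding copies lazily leaves it to the caller to decide whether to merge them into one system, which fits the point that no transformation should be applied by default.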