mdanalysis
mdanalysis copied to clipboard
XYZWriter odd behaviour
Expected behavior
If I reorder some atoms I expect to get the right name along the right coordinates.
Original coordinates: O 0.00000 0.00000 0.00000 N 2.00000 0.00000 0.00000 H 0.00000 1.50000 0.00000
Expected coordinates after reordering atoms 1 and 2: O 0.00000 0.00000 0.00000 H 0.00000 1.50000 0.00000 N 2.00000 0.00000 0.00000
Actual behavior
I obtain: O 0.00000 0.00000 0.00000 H 2.00000 0.00000 0.00000 N 0.00000 1.50000 0.00000
The symbols are reordered but the coordinates remain unchanged.
Code to reproduce the behavior
import MDAnalysis as mda
u = mda.Universe("uno.xyz")
firstframe = 0
lastframe = 1
increment = 1
frames = slice(firstframe,lastframe,increment)
for ts in u.trajectory[frames]:
idx_lst_sort = [0, 2, 1]
ordenados = u.atoms[idx_lst_sort]
print(u.atoms)
print(u.atoms.positions)
print(ordenados.atoms)
print(ordenados.atoms.positions)
with mda.Writer("dos.xyz", ordenados.n_atoms) as W:
W.write(ordenados)
with mda.Writer("dos.pdb", ordenados.n_atoms) as W:
W.write(ordenados)
The uno.xyz file is:
3
frame 0
O 0.00000 0.00000 0.00000
N 2.00000 0.00000 0.00000
H 0.00000 1.50000 0.00000
The dos.xyz file:
3
frame 0 | Written by MDAnalysis XYZWriter (release 2.2.0)
O 0.00000 0.00000 0.00000
H 2.00000 0.00000 0.00000
N 0.00000 1.50000 0.00000
The new order is not respected by XYZWriter while PDBWriter does.
The dos.pdb file is:
TITLE MDANALYSIS FRAMES FROM 0, STEP 1: Created by PDBWriter
CRYST1 1.000 1.000 1.000 90.00 90.00 90.00 P 1 1
REMARK 285 UNITARY VALUES FOR THE UNIT CELL AUTOMATICALLY SET
REMARK 285 BY MDANALYSIS PDBWRITER BECAUSE UNIT CELL INFORMATION
REMARK 285 WAS MISSING.
REMARK 285 PROTEIN DATA BANK CONVENTIONS REQUIRE THAT
REMARK 285 CRYST1 RECORD IS INCLUDED, BUT THE VALUES ON
REMARK 285 THIS RECORD ARE MEANINGLESS.
MODEL 1
ATOM 1 O UNK X 1 0.000 0.000 0.000 1.00 0.00 SYST O
ATOM 2 H UNK X 1 0.000 1.500 0.000 1.00 0.00 SYST H
ATOM 3 N UNK X 1 2.000 0.000 0.000 1.00 0.00 SYST N
ENDMDL
END
Current version of MDAnalysis
- Which version are you using? (run
python -c "import MDAnalysis as mda; print(mda.__version__)"
) 2.2.2 - Which version of Python (
python -V
)? Python 3.9.13 - Which operating system? openSUSE Leap 15.4
Looks like the issue here is that we use a Timestep object slice to write out coordinates instead of an atomgroup (whilst we do use the atomgroup to get the elements). So we lose re-ordering information when writing out positions.
I guess our best approach here is just to switch over to plainly using atomgroups for everything rather than relying on Timestep.
If the change from Timestep to atomgroups is not trivial, maybe something like: MDAnalysis/coordinates> diff XYZ.py.ORIG XYZ.py
241c241
< # update atom names
---
> # update atom names & positions
242a243
> ts.positions = atoms.atoms.positions
could work for the time been?
Is this behavior inconsistent between writers?
I only had time to look at XYZ. To me it looks as if we should do https://github.com/MDAnalysis/mdanalysis/blob/2c32ed11c1e6b800397f147e4242f7325900b669/package/MDAnalysis/coordinates/XYZ.py#L238 if we detect that the Atomgroup indices are not the same as the indices of the original universe.atoms
. At the moment, the copy_slice()
is only performed if we have fewer atoms.
I am not 100% sure if, for XYZ, there was any advantage to using a slice on a Timestep instead of using atoms.positions
, which performs the same slicing operation https://github.com/MDAnalysis/mdanalysis/blob/2c32ed11c1e6b800397f147e4242f7325900b669/package/MDAnalysis/core/groups.py#L2775
but Timestep.copy_slice()
should make sure that everything (forces, velocities, ...) is properly copied — this might not matter for XYZ but it matters for others. For consistency, we should use copy_slice()
everywhere when necessary but use the original Timestep when possible (for performance).
The behaviour is different between the XYZwriter and the PDB and GRO writers. I haven't tested the writers which produces binary output. PDB and GRO get the names and coordinates changed.
Thanks for checking @rafaelpap . I suppose we'll need to have a careful look around the code of the other writers to try to make it consistent.