mdanalysis icon indicating copy to clipboard operation
mdanalysis copied to clipboard

Applying Transformation to Chain of Traj doesn't Apply to First Frame of 2nd, 3rd... trajectories

Open jaclark5 opened this issue 2 years ago • 9 comments

Expected behavior

After creating a universe with multiple trajectory files, and a series of transformations are applied following the protein example. Those transformations are expected to be applied to all frames.

Actual behavior

The first frame of the 2nd, 3rd, etc. trajectories are unwrapped. This is evidenced by writing out the coordinates to the screen, or writing to a pdb file.

It seems to be a discrepancy with the atoms.positions and ts._pos being synced, as the later is used by the Writer.

Code to reproduce the behavior

Run short LJ lammps simulation.

units           lj
atom_style      full

region          box block 0 10 0 10 0 10
create_box      2 box bond/types 1 extra/bond/per/atom 1
create_atoms    1 single 5 5 5.5
create_atoms    1 single 5 5 4.5
create_atoms    2 random 200 1234 box

pair_style      lj/cut 4.5
pair_coeff      * * 1.0 1.0
mass            * 1.0

bond_style      harmonic
bond_coeff      1 1000.0 1.0
create_bonds    single/bond 1 1 2

##########
neighbor 2.0 bin
neigh_modify delay 0 every 1 check yes

min_style cg # conjugate gradient (CG) algorithm
minimize 1.0e-4 1.0e-6 1000 10000

###########

variable TK equal 2
variable Pbar equal 0.01
variable seed equal 1234
variable freq equal 1000

velocity all create ${TK} ${seed} rot yes dist gaussian

fix             1 all nvt temp ${TK} ${TK} 0.05

timestep        0.01
reset_timestep 0
thermo          ${freq}

write_data initial.data
dump 1 all custom ${freq} dump_1.lammpstrj id mol type q xu yu zu
run             5000
undump 1

dump 1 all custom ${freq} dump_2.lammpstrj id mol type q xu yu zu
run             5000

Import LAMMPS trajectories, center around bonded molecule, and write to pdb.

from MDAnalysis import transformations as trans
import MDAnalysis as mda

runs = [1,2]
select = "type 1"

data_file = "run0/initial.data"
dumpfile = "run0/dump_{}.lammpstrj"
dump_file = [dumpfile.format(run) for run in runs]

universe = mda.Universe(data_file, dump_file, format="LAMMPSDUMP")
group_target = universe.select_atoms(select)
group_remaining = universe.select_atoms("all") - group_target
box_center = universe.trajectory[0].dimensions[:3] / 2
com = group_target.center_of_mass(unwrap=True)
transforms = [
    trans.unwrap(group_target),
    trans.translate(box_center-com),
    trans.wrap(group_remaining)
]
universe.trajectory.add_transformations(*transforms)

group = universe.select_atoms("all")
with mda.Writer("output_unwrapped.pdb", group.n_atoms) as W:
    for ii,ts in enumerate(universe.trajectory):
        W.write(group)

Current version of MDAnalysis

  • MDA Version: '2.1.0-dev0'
  • Python Version: 3.8.12
  • MacOS

jaclark5 avatar Apr 28 '22 15:04 jaclark5

It seems that when chain.py uses activate_reader.ts, it is using a separate Timestep (different id() ) with the Timestep instance of the universe. Thus, when wrap.wrap() called the AtomGroup.wrap function, the latter is still on the last frame of the previous trajectory file, and does not wrap the appropriate frame.

jaclark5 avatar Apr 29 '22 14:04 jaclark5

Hi @jaclark5 thanks for the detailed error report! I'm having a little trouble reproducing the problem, I did the code snippet below to try and reproduce and it seems to work fine (writes out a file where atoms either have positions of 0 or 1 for all frames). I'm not sure it will be possible for atoms.positions and ts._pos to become unsynched, but the different id()s for Timestep might be a clue. I'll also dig into the individual transforms you are using to see if the bug is hiding there. So no fix yet, but hopefully we'll find something soon.

import MDAnalysis as mda
from MDAnalysisTests.datafiles import GRO, TRR
from MDAnalysis import transformations

u = mda.Universe(GRO, [TRR, TRR])

def set_every_other_zero(ts):
    ts.positions[::2] = 0.0
    
    return ts

def set_every_other_one(ts):
    ts.positions[1::2] = 1.0
    
    return ts

u.trajectory.add_transformations(set_every_other_zero,
                                 set_every_other_one)

ag = u.atoms[:10]

with mda.Writer('new.pdb') as w:
    for ts in u.trajectory:
        w.write(ag)

richardjgowers avatar May 01 '22 09:05 richardjgowers

Hi @richardjgowers, the issue seems to specifically be with the wrap transformation. The chain reader will (1) activate the new ts, (2) transform the ts trajectory, and then (3) associate that ts with the universe. Unlike other transformations the wrap transformation uses the AtomGroup and thus the universe, however the ts hasn't transitioned to the new trajectory yet. Although I laid out these steps simply, the actual solution in the code is more obscure. Yes, you will notice the id() does not match with the wrap function if you print it in transformation.wrap() and AtomGroup.wrap(), as well as the side effect in the written trajectory I provided where the first frame of the second trajectory is not wrapped.

jaclark5 avatar May 02 '22 12:05 jaclark5

This issue can be resolved with the following two lines of code:

MDAnalysis.transformations.wrap()

     def _transform(self, ts):

+        if id(ts) != id(self.ag.universe.trajectory.ts):
+            self.ag.universe.trajectory.ts = ts
         self.ag.wrap(compound=self.compound)

         return ts

jaclark5 avatar May 03 '22 15:05 jaclark5

@jaclark5 if you've already identified a fix, can I encourage you to create a PR?

We have some documention in the User Guide on Contributing to the Main Code Base.

orbeckst avatar May 12 '22 10:05 orbeckst

@orbeckst it was brought up in the review of my Pull Request that this could also be an issue in the unwrap function. Indeed this is the case. If I run the example provided in this issue with a print statement in the unwrap function I obtain the following output: ts frame, ts id, ag ts frame, ag ts id 1 5662923360 1 5662923360 2 5662923360 2 5662923360 3 5662923360 3 5662923360 4 5662923360 4 5662923360 5 5662923360 5 5662923360 0 5662717984 5 5662923360 1 5662717984 1 5662717984 2 5662717984 2 5662717984 3 5662717984 3 5662717984 4 5662717984 4 5662717984 5 5662717984 5 5662717984

Notice again in the transition from dump_1 to dump_2, that the ts frame is correct, but it never transformed since the atom group ts has not been updated. Should I just add the same if statement to the unwrap class and include it in the same Pull Request?

jaclark5 avatar Sep 19 '22 13:09 jaclark5

Using the same example given in this issue, I looked at fit_translation and fit_rot_trans and I do think it belongs in this Pull Request. To reproduce what I'm describing, run the lammps job provided above with the following python script:

import MDAnalysis as mda

runs = [1,2]
select = "type 1"

data_file = "run0/initial.data"
dumpfile = "run0/dump_{}.lammpstrj"
dump_file = [dumpfile.format(run) for run in runs]
universe1 = mda.Universe(data_file, dump_file, format="LAMMPSDUMP")
universe2 = mda.Universe(data_file, dumpfile.format(2), format="LAMMPSDUMP")

group_ref = universe2.select_atoms(select)
group_target = universe1.select_atoms(select)
transforms = [
    trans.fit_translation(group_target, group_ref, plane="xy"),
    trans.fit_rot_trans(group_target, group_ref, plane="xy"),
]
universe1.trajectory.add_transformations(*transforms)

group = universe1.select_atoms("all")
with mda.Writer("output_fit.pdb", group.n_atoms) as W:
    for ii,ts in enumerate(universe1.trajectory):
        W.write(group)

In transformations.fit.py add the following line to the trans._tranform(self, ts) methods:

print("translate", ts.frame, id(ts), self.mobile.universe.trajectory.ts.frame, id(self.mobile.universe.trajectory.ts))
print("rot_trans", ts.frame, id(ts), self.mobile.universe.trajectory.ts.frame, id(self.mobile.universe.trajectory.ts))

Output: trans type, ts frame, ts id, mobile frame, mobile id translate 1 5774703920 1 5774703920 rot_trans 1 5774703920 1 5774703920 translate 2 5774703920 2 5774703920 rot_trans 2 5774703920 2 5774703920 translate 3 5774703920 3 5774703920 rot_trans 3 5774703920 3 5774703920 translate 4 5774703920 4 5774703920 rot_trans 4 5774703920 4 5774703920 translate 5 5774703920 5 5774703920 rot_trans 5 5774703920 5 5774703920 translate 0 5774568176 5 5774703920 rot_trans 0 5774568176 5 5774703920 translate 1 5774568176 1 5774568176 rot_trans 1 5774568176 1 5774568176 translate 2 5774568176 2 5774568176 rot_trans 2 5774568176 2 5774568176 translate 3 5774568176 3 5774568176 rot_trans 3 5774568176 3 5774568176 translate 4 5774568176 4 5774568176 rot_trans 4 5774568176 4 5774568176 translate 5 5774568176 5 5774568176 rot_trans 5 5774568176 5 5774568176

Notice again that when the timestep switches to the second trajectory, the atom group uses the last frame of the previous trajectory. The result is that the first frame of the second trajectory is skipped all together.

jaclark5 avatar Sep 19 '22 14:09 jaclark5

As this seems to be a broader problem with a common solution you can do one PR.

orbeckst avatar Sep 19 '22 15:09 orbeckst

Hey sure! I can get to it next week.

On May 12, 2022, at 2:34 AM, Oliver Beckstein @.***> wrote:

 @jaclark5 if you've already identified a fix, can I encourage you to create a PR?

We have some documention in the User Guide on Contributing to the Main Code Base.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.

jaclark5 avatar Oct 11 '22 09:10 jaclark5