
Speeding up process_replica_exchange

Open cwalker7 opened this issue 4 years ago • 2 comments

I've noticed that process_replica_exchange in rep_exch.py can be pretty slow - particularly this block of code which writes the .dat file:

f = open(os.path.join(output_directory, "replica_energies.dat"), "w")
for step in range(total_steps):
    f.write(f"{step:10d}")
    sampler_states = reporter.read_sampler_states(iteration=step)
    for replica_index in range(n_replicas):
        replica_positions[replica_index, step, :, :] = sampler_states[replica_index].positions
        f.write(f"{replica_energies[replica_index,replica_index,step]:12.6f}")
    f.write("\n")
f.close()

It takes about 10 minutes to write the .dat file for 1 million frames, which is ~75% of the entire process_replica_exchange time (excluding writing pdb or dcd trajectory files). I suspect the cause is that we write each energy individually, one at a time, but I will have to check.
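If the per-value writes do turn out to be the bottleneck, one option is to pull the diagonal energies out in a single NumPy call and write the whole table at once. The following is only a sketch under assumptions: the function name `write_replica_energies` is hypothetical, `replica_energies` is assumed to be a NumPy array of shape (n_replicas, n_replicas, total_steps) as the indexing above implies, and the sampler-state/position reads from the original loop are left out, so any time spent in `reporter.read_sampler_states` is not addressed here.

```python
import os
import tempfile

import numpy as np


def write_replica_energies(replica_energies, path):
    """Write step index plus each replica's own-state energy, one row per step.

    Hypothetical helper; assumes shape (n_replicas, n_replicas, total_steps).
    """
    n_replicas, _, total_steps = replica_energies.shape
    # diag[step, i] == replica_energies[i, i, step]
    diag = np.diagonal(replica_energies, axis1=0, axis2=1)
    # First column is the step index, remaining columns are the energies.
    table = np.column_stack((np.arange(total_steps), diag))
    # Same field widths as the original loop: 10-wide step, 12.6f energies.
    row_format = "%10d" + "%12.6f" * n_replicas
    np.savetxt(path, table, fmt=row_format)


# Small synthetic example (random numbers, not real simulation output).
rng = np.random.default_rng(0)
energies = rng.normal(size=(3, 3, 5))
out_path = os.path.join(tempfile.mkdtemp(), "replica_energies.dat")
write_replica_energies(energies, out_path)
```

The row format is passed to `np.savetxt` as a single multi-format string, so the fields are written back to back with the same widths as in the loop above. Whether this removes most of the 10 minutes would still need to be timed on a real trajectory.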

cwalker7, Sep 16 '20 22:09

Interesting. I don't have an immediate thought, other than that we may not actually need to write ASCII files much of the time. Could this part perhaps be made optional?

mrshirts, Sep 16 '20 22:09

OK, I think it makes sense to have the .dat file be optional for now. Currently we are only using that file for the physical validation, but the energies could just as easily be stored as a pickle or with numpy.save and read back with numpy.load.
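A minimal sketch of what that could look like, assuming the energies are held in a NumPy array. The function name `save_replica_energies`, the `write_dat_file` flag, and the `replica_energies.npy` file name are all hypothetical and not part of cg_openmm; only `numpy.save`, `numpy.load`, and `numpy.savetxt` are real library calls.

```python
import os
import tempfile

import numpy as np


def save_replica_energies(replica_energies, output_directory, write_dat_file=False):
    """Save energies in binary form; optionally also write the text .dat file.

    Hypothetical helper; assumes shape (n_replicas, n_replicas, total_steps).
    """
    npy_path = os.path.join(output_directory, "replica_energies.npy")
    np.save(npy_path, replica_energies)  # binary, full precision

    if write_dat_file:
        n_replicas, _, total_steps = replica_energies.shape
        # Own-state energies: diag[step, i] == replica_energies[i, i, step]
        diag = np.diagonal(replica_energies, axis1=0, axis2=1)
        table = np.column_stack((np.arange(total_steps), diag))
        dat_path = os.path.join(output_directory, "replica_energies.dat")
        np.savetxt(dat_path, table, fmt="%10d" + "%12.6f" * n_replicas)

    return npy_path


# Round trip on synthetic data (random numbers, not real simulation output).
rng = np.random.default_rng(1)
energies = rng.normal(size=(4, 4, 6))
output_directory = tempfile.mkdtemp()
npy_path = save_replica_energies(energies, output_directory)
loaded = np.load(npy_path)
```

With the flag left at its default, only the binary file is written, and a downstream step such as the physical validation could load the array with `np.load` instead of parsing text.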

cwalker7, Sep 17 '20 14:09