mdanalysis icon indicating copy to clipboard operation
mdanalysis copied to clipboard

No need to redirect to the original frame explicitly during deserialization

Open yuxuanzhuang opened this issue 2 years ago • 17 comments

Fixes #3723 Partial fix: #3721

Changes made in this Pull Request:

  • no redirection to the original frame (explicitly) during deserialization
  • TextIOPicklable serializes the raw class instance instead of class name.

PR Checklist

  • [x] Tests?
  • [x] Docs?
  • [x] CHANGELOG updated?
  • [x] Issue raised/referenced?

yuxuanzhuang avatar Jun 19 '22 18:06 yuxuanzhuang

Hello @yuxuanzhuang! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2024-04-03 14:10:03 UTC

pep8speaks avatar Jun 19 '22 18:06 pep8speaks

Codecov Report

Attention: Patch coverage is 95.00000% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 93.64%. Comparing base (73acc9b) to head (1a6272e).

Files Patch % Lines
package/MDAnalysis/lib/picklable_file_io.py 94.44% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3722      +/-   ##
===========================================
- Coverage    93.66%   93.64%   -0.02%     
===========================================
  Files          168      180      +12     
  Lines        21248    22332    +1084     
  Branches      3917     3918       +1     
===========================================
+ Hits         19902    20913    +1011     
- Misses         888      961      +73     
  Partials       458      458              

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jun 19 '22 22:06 codecov[bot]

Once you need someone to review, feel free to ping/request reviewers.

I'd also suggest using the "convert to draft" in GitHub instead of just adding "WIP" to the issue title as it shows more intentionally if you currently want reviews or not.

orbeckst avatar Jun 22 '22 17:06 orbeckst

This is probably an improvement, but I think it'll be a change in behaviour so we'll need to raise warnings etc.

The behaviour change is how the reading location TextIOPicklable is recovered. Before it is done by universe.trajectory[i]. Now it is directly serialized. Not sure if there's any explicitly warning that should be raised anywhere?

yuxuanzhuang avatar Jun 30 '22 12:06 yuxuanzhuang

@richardjgowers Okay, I never thought about preserving the modification of coordinates (before).

But the behavior does change after this PR, i.e. now the modification is preserved (because no data reloading from the file is performed after this line is removed) which is more "correct".

u = mda.Universe(XYZ)

print('Orig pos:,', u.trajectory.ts.positions[0])
u.trajectory.ts.positions[0] = np.array([1, 2, 3])
print('New pos:,', u.trajectory.ts.positions[0])
u_new = pickle.loads(pickle.dumps(u))
print('After serializing pos:,', u_new.trajectory.ts.positions[0])

# original output
Orig pos:, [ 0.931 17.318 16.423]
New pos:, [1. 2. 3.]
After serializing pos:, [ 0.931 17.318 16.423]

# new output after #PR 3722
Orig pos:, [ 0.931 17.318 16.423]
New pos:, [1. 2. 3.]
After serializing pos:, [1. 2. 3.]

yuxuanzhuang avatar Sep 12 '22 09:09 yuxuanzhuang

@yuxuanzhuang thanks for adding the test - it looks like it found that h5md format has a custom serialisation mechanism

richardjgowers avatar Sep 12 '22 10:09 richardjgowers

Should work now! (too many formats to keep track of :) )

yuxuanzhuang avatar Sep 12 '22 11:09 yuxuanzhuang

@yuxuanzhuang could you resolve the conflicts please?

orbeckst avatar Dec 15 '23 20:12 orbeckst

@yuxuanzhuang and @richardjgowers this PR still looks relevant. Why did it stall? What needs to be done to complete it?

orbeckst avatar Dec 15 '23 20:12 orbeckst

Linter Bot Results:

Hi @yuxuanzhuang! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location Outcome
main package ⚠️ Possible failure
testsuite ⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/8540187312/job/23396655061


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

github-actions[bot] avatar Dec 16 '23 11:12 github-actions[bot]

@yuxuanzhuang could you resolve the conflicts please?

Done! Should I worry about the linter bot?

yuxuanzhuang avatar Dec 16 '23 11:12 yuxuanzhuang

@yuxuanzhuang could you resolve the conflicts please?

Done! Should I worry about the linter bot?

Unless it outlines a flake8 failure it's ok to ignore.

Just checked - no flake8 so you're good to go.

IAlibay avatar Dec 16 '23 11:12 IAlibay

I understand why TNGReader needs to re-read the current frame during serialization; however, due to this, the modification of ts cannot be preserved in the TNG format. If I comment it out, all the tests will pass.

        # make sure we re-read the current frame to update C level objects in
        # the file iterator
        self._read_frame(self._frame)

Do you think it's reasonable to only make sure _file_iterator reads the step but no change of ts? @hmacdope

        # convert from frames to integrator steps
        step = self._frame_to_step(self._frame + 1)
        iterator_step = self._file_iterator.read_step(step)

yuxuanzhuang avatar Dec 18 '23 09:12 yuxuanzhuang

I understand why TNGReader needs to re-read the current frame during serialization; however, due to this, the modification of ts cannot be preserved in the TNG format. If I comment it out, all the tests will pass.

        # make sure we re-read the current frame to update C level objects in
        # the file iterator
        self._read_frame(self._frame)

Do you think it's reasonable to only make sure _file_iterator reads the step but no change of ts? @hmacdope

        # convert from frames to integrator steps
        step = self._frame_to_step(self._frame + 1)
        iterator_step = self._file_iterator.read_step(step)

Just checking my understanding @yuxuanzhuang, this means that the coordinate data will be re-read, but that any additional data in ts will become invalid? I am just not sure exactly what is meant by change of ts. Perhaps a super quick example that demos the difference?

Sorry about this mess, tbh this is being caused by my lazyness from way back where I didn't make the cython level classes pickleable in pytng (its a bit complicated for __cinnit__ using classes). See https://github.com/MDAnalysis/pytng/issues/97. If I am understanding correctly it would be better to just be able to pickle everything there, rather than forcing a re-read of the step? I can look at addressing this but may take a while as super busy. Perhaps better to go for an interim solution that you suggested above?

hmacdope avatar Dec 19 '23 22:12 hmacdope

@hmacdope and @richardjgowers can you please check if your raised concerns have been addressed?

If yes, can we merge?

Otherwise, are there bigger issues that we cannot solve?

orbeckst avatar Mar 30 '24 17:03 orbeckst

I am just not sure exactly what is meant by change of ts. Perhaps a super quick example that demos the difference?

I mean modification of e.g. coordinates.

u = mda.Universe(TNG_traj_gro, TNG_traj)
reader = u.trajectory
reader.ts.positions[0] = np.array([1, 2, 3])
reader_p = pickle.loads(pickle.dumps(reader))
assert_equal(reader.ts, reader_p.ts,
                       "Modification of ts not preserved after serialization")

Previously, during unpickling, self._read_frame(self._frame) will 1) update self._file_iterator to the current frame. 2) modify self.ts by reading from self._file_iterator. This will discard any change to the timestep itself.

Now, only self._file_iterator will be reopened and updated. self.ts will be preserved during serialization.

yuxuanzhuang avatar Apr 03 '24 15:04 yuxuanzhuang

@richardjgowers is your review still pertinent or should I just dismiss it?

orbeckst avatar Apr 04 '24 00:04 orbeckst