Update pack and unpack methods

Open samwaseda opened this issue 1 year ago • 3 comments

Update of the pack and unpack methods. https://github.com/pyiron/pyiron_base/issues/775

  • [x] pack includes the csv file inside the archive
  • [x] pack filename is optional, uses ".tar.gz"
  • [x] pack with same name as project should not delete the project
  • [ ] pack selected jobs by id
  • [x] pack all files in a job
  • [x] pack from a different directory than where project is located
  • [x] unpack method can be called as pr = Project("filename.tar.gz", unpack=True) or pr = Project("filename", unpack=True)
  • [x] unpack should not nest project automatically
  • [ ] unpack jobs into existing Project
  • [ ] Update tests
  • [ ] Update docstrings
  • [ ] Update workflow template
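
A minimal sketch of the packing behaviour listed above, written with the standard library only. It is not the pyiron_base implementation: the helper name `pack_directory`, the CSV name `export.csv`, and the column names are assumptions chosen for illustration. It shows an optional archive name that falls back to the project name, a ".tar.gz" suffix that is appended only when missing, a job table stored inside the archive, and a source directory that is read but never deleted.

```python
import csv
import io
import tarfile
from pathlib import Path


def pack_directory(project_dir, archive_name=None, job_rows=(), csv_name="export.csv"):
    """Hypothetical helper: bundle a directory and a job table into one .tar.gz."""
    project_dir = Path(project_dir)
    # The archive name is optional; fall back to the project directory name.
    archive = Path(archive_name) if archive_name is not None else Path(project_dir.name)
    # Append ".tar.gz" only when it is missing.
    if not archive.name.endswith(".tar.gz"):
        archive = archive.with_name(archive.name + ".tar.gz")

    # Render the job table in memory so no extra file is written next to the project.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "job", "status"])
    writer.writeheader()
    writer.writerows(job_rows)
    payload = buffer.getvalue().encode("utf-8")

    with tarfile.open(archive, "w:gz") as tar:
        # The source directory is only read here, never moved or removed.
        tar.add(project_dir, arcname=project_dir.name)
        # Store the job table as a regular member of the same archive.
        info = tarfile.TarInfo(name=csv_name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return archive
```

Because the archive name ends in ".tar.gz" while the project is a plain directory, the two never collide on disk even when they share the same stem.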

samwaseda avatar May 23 '24 08:05 samwaseda

I am going to take over @srmnitc's work, so I copied the PR from his forked branch

samwaseda avatar May 23 '24 08:05 samwaseda

OK, I just realized that I already had a different PR open and opened this one on top of it. Let me correct this first XD

samwaseda avatar May 23 '24 08:05 samwaseda

Thanks @samwaseda for picking this up. There are a couple of fixes also included in https://github.com/pyiron/pyiron_base/pull/1401 so maybe it makes sense to merge those as well.

jan-janssen avatar May 23 '24 11:05 jan-janssen

OK, this is now feature-complete. I might still do some cleanup, but otherwise it is ready for review.

samwaseda avatar Aug 15 '24 12:08 samwaseda

@pmrv @jan-janssen @srmnitc I guess it's now feature-complete, and this is probably going to be the last PR in the pack/unpack series. After this one, a minor release can be made and the topic is hopefully settled.

samwaseda avatar Aug 15 '24 16:08 samwaseda

As far as I understand the csv file is now included in the tar archive, correct? How do you handle the backwards compatibility for old archives which do not include the csv file? Maybe it makes sense to have a short example either in the Docstring or the jupyter notebook to handle this backwards compatibility. Finally, I liked the option to take a look at the csv file to see which jobs are included in the archive before importing the corresponding jobs. Previously, I did this by loading the csv file with pandas. I can still load the csv file manually from the tar archive, but for other users it would be great to take a look at the archive and get the job table of the contained project from the python side.
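
A minimal sketch of the inspection step described above, using only the standard library. The function name `read_job_table` and the member name `export.csv` are assumptions, not the pyiron_base API. It reads the job table directly from the archive without extracting anything, and returns `None` for an older archive that has no CSV inside, so the caller can fall back to a CSV file stored next to the archive.

```python
import csv
import io
import tarfile


def read_job_table(archive_path, csv_name="export.csv"):
    """Return the job table rows stored in the archive, or None if it has no CSV."""
    # "r:*" lets tarfile detect the compression (gzip, bz2, xz, or none).
    with tarfile.open(archive_path, "r:*") as tar:
        try:
            member = tar.getmember(csv_name)
        except KeyError:
            # Older archive: the CSV was shipped as a separate file, not inside.
            return None
        handle = tar.extractfile(member)
        text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
        # Materialize the rows before the archive is closed.
        return list(csv.DictReader(text))
```

The same file object returned by `tar.extractfile` could instead be passed to `pandas.read_csv` if a DataFrame is preferred.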

jan-janssen avatar Aug 15 '24 17:08 jan-janssen

> As far as I understand the csv file is now included in the tar archive, correct? How do you handle the backwards compatibility for old archives which do not include the csv file? Maybe it makes sense to have a short example either in the Docstring or the jupyter notebook to handle this backwards compatibility.

That's a good point; I should mention it in the code. I guess it will help future developers understand where some of the code comes from.

> Finally, I liked the option to take a look at the csv file to see which jobs are included in the archive before importing the corresponding jobs. Previously, I did this by loading the csv file with pandas. I can still load the csv file manually from the tar archive, but for other users it would be great to take a look at the archive and get the job table of the contained project from the python side.

That sounds good, but I guess it belongs in a separate PR.

samwaseda avatar Aug 15 '24 19:08 samwaseda

Can I merge this one or should I still wait for a review?

samwaseda avatar Aug 16 '24 09:08 samwaseda