Faster delitems

Open d-v-b opened this issue 2 years ago • 3 comments

fixes #1336

We first try to delete the keys without checking whether they exist in storage. If the storage backend treats deleting a missing key as an error, and signals that error by raising FileNotFoundError, we fall back to filtering the keys by whether they exist in storage before deleting them. This fallback is much slower.
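As a rough illustration of the strategy described above, here is a minimal sketch. It is not the code in this PR: `StrictBackend` and `delitems` are hypothetical names, and the backend is an in-memory stand-in whose `rm()` raises FileNotFoundError on a missing key.

```python
from typing import Dict, Iterable, List


class StrictBackend:
    """Hypothetical in-memory backend: rm() raises on a missing key."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self.data = dict(data)

    def exists(self, key: str) -> bool:
        return key in self.data

    def rm(self, keys: Iterable[str]) -> None:
        for key in keys:
            if key not in self.data:
                raise FileNotFoundError(key)
            del self.data[key]


def delitems(backend: StrictBackend, keys: Iterable[str]) -> None:
    """Delete keys optimistically; filter by existence only on failure."""
    key_list: List[str] = list(keys)
    try:
        # Fast path: no existence checks at all.
        backend.rm(key_list)
    except FileNotFoundError:
        # Slow path: keep only keys that are still present, then retry.
        # Keys already removed by the failed call are filtered out here.
        remaining = [k for k in key_list if backend.exists(k)]
        if remaining:
            backend.rm(remaining)


backend = StrictBackend({"a": b"1", "b": b"2", "c": b"3"})
delitems(backend, ["a", "missing", "c"])
```

The slow path only runs when the backend actually complains, so backends that silently ignore missing keys never pay for the existence checks.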

TODO:

  • [ ] Add unit tests and/or doctests in docstrings
  • [ ] Add docstrings and API docs for any new/modified user-facing classes and functions
  • [ ] New/modified features documented in docs/tutorial.rst
  • [ ] Changes documented in docs/release.rst
  • [ ] GitHub Actions have all passed
  • [ ] Test coverage is 100% (Codecov passes)

d-v-b avatar Sep 14 '23 22:09 d-v-b

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (8ac8553) 99.99% compared to head (769e76e) 100.00%. Report is 1 commits behind head on main.

:exclamation: Current head 769e76e differs from pull request most recent head e2c0ce4. Consider uploading reports for the commit e2c0ce4 to get more accurate results

Additional details and impacted files
@@            Coverage Diff            @@
##             main     #1524    +/-   ##
=========================================
  Coverage   99.99%   100.00%            
=========================================
  Files          38        37     -1     
  Lines       14563     14740   +177     
=========================================
+ Hits        14562     14740   +178     
+ Misses          1         0     -1     
| Files | Coverage Δ |
|---|---|
| zarr/storage.py | 100.00% <100.00%> (ø) |

... and 30 files with indirect coverage changes

codecov[bot] avatar Sep 14 '23 23:09 codecov[bot]

This is fine, but rm() in fsspec should really allow on_error on all calls consistently. It's worth noting that the backend that doesn't ignore FileNotFound (i.e., local) is also the one that deletes files sequentially as opposed to async.

martindurant avatar Sep 16 '23 17:09 martindurant

> is also the one that deletes files sequentially

I mean, we could try/except on individual files in this case and still not bother explicitly checking existence first.
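A minimal sketch of that per-file idea, assuming a local filesystem and using only the standard library (`remove_files_sequentially` is a hypothetical helper name, not an fsspec or zarr API):

```python
import os
import tempfile
from typing import Iterable


def remove_files_sequentially(paths: Iterable[str]) -> int:
    """Remove each path in turn, skipping ones that are already gone.

    Returns the number of files actually removed. No existence check is
    made up front; a missing file is detected from the exception.
    """
    removed = 0
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            continue  # Missing file: nothing to delete, move on.
        removed += 1
    return removed


# Usage: two real files plus one path that does not exist.
with tempfile.TemporaryDirectory() as tmp:
    present = [os.path.join(tmp, name) for name in ("a", "b")]
    for path in present:
        with open(path, "wb") as f:
            f.write(b"x")
    missing = os.path.join(tmp, "missing")
    removed_count = remove_files_sequentially([present[0], missing, present[1]])
    leftovers = os.listdir(tmp)
```

Since the local case deletes files one at a time anyway, catching the error per file costs one system call per key rather than an existence check followed by a delete.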

martindurant avatar Sep 16 '23 17:09 martindurant

I'm going to close this as stale. Folks should feel free to reopen if there is interest in continuing this work.

jhamman avatar Oct 11 '24 23:10 jhamman