zarr-python
zarr-python copied to clipboard
Fix a TOCTOU race in `open_array(mode="a")`
If you have multiple writers trying todo zarr.open_array(..., "a"),
the old code code sometimes raise a ContainsArrayError, even thogh
opening an already-existing array should be fine.
The underlying problem is a time-of-check/time-of-use race condition,
where the array can be created (by another writer) between the
contains_array check and the init_array call.
To fix this, we simply swallow the error from the init_array call.
I'm not sure how best to test this (since the bug is inherently non-deterministic based on scheduling), but I'm happy to take suggestoins.
Codecov Report
Merging #963 (4cfc5bb) into main (5cdb202) will decrease coverage by
0.02%. The diff coverage is91.42%.
:exclamation: Current head 4cfc5bb differs from pull request most recent head 05396aa. Consider uploading reports for the commit 05396aa to get more accurate results
@@ Coverage Diff @@
## main #963 +/- ##
==========================================
- Coverage 99.94% 99.92% -0.03%
==========================================
Files 34 34
Lines 13847 13744 -103
==========================================
- Hits 13840 13734 -106
- Misses 7 10 +3
| Impacted Files | Coverage Δ | |
|---|---|---|
| zarr/creation.py | 99.40% <75.00%> (-0.60%) |
:arrow_down: |
| zarr/tests/test_creation.py | 99.62% <93.54%> (-0.38%) |
:arrow_down: |
| zarr/hierarchy.py | 99.78% <0.00%> (-0.01%) |
:arrow_down: |
| zarr/core.py | 100.00% <0.00%> (ø) |
|
| zarr/storage.py | 100.00% <0.00%> (ø) |
|
| zarr/indexing.py | 100.00% <0.00%> (ø) |
|
| zarr/convenience.py | 100.00% <0.00%> (ø) |
|
| zarr/_storage/store.py | 100.00% <0.00%> (ø) |
|
| zarr/tests/test_core.py | 100.00% <0.00%> (ø) |
|
| zarr/tests/test_storage.py | 100.00% <0.00%> (ø) |
|
| ... and 4 more |
Thanks, @alanhdu. ~Do you have any ideas for a test that would trigger the race condition so we can keep the code coverage up?~ (Missed your comment) To get code coverage passing, we could potentially add a # pragma: no cover to the line. An idea would be to at least try the same operation from a number of threading.Thread instances. Using a Event e.g. might help get all threads to the same state together.
(Edit) For example:
import threading
import logging
logging.basicConfig(level=logging.DEBUG,
format='(%(threadName)-9s) %(message)s',)
event = threading.Event()
class Thread(threading.Thread):
def __init__(self, event):
threading.Thread.__init__(self)
self.event = event
def run(self):
logging.debug("do some setup")
self.event.wait()
logging.debug("now attempt the race condition")
threads = [Thread(event) for i in range(10)]
for t in threads:
t.start()
event.set()
However there are also a number of tests under zarr/tests/test_sync.py might help you.
@alanhdu : do you want to give this a try?
@alanhdu: I've added my test proposal but I've not yet been able to reproduce a failure.
p.s. @Alt-Shivam, thanks!
flake8...................................................................Failed
- hook id: flake8
- exit code: 1
zarr/tests/test_creation.py:733:5: F841 local variable 'expected_zarr_version' is assigned to but never used
Check Yaml...............................................................Passed