
Writing metadata to a timestamp leads to race conditions

Open kylemann16 opened this issue 7 months ago • 0 comments

I've recently run into an issue where my metadata doesn't seem to update promptly when I write to a specific timestamp, even on small writes with tiny databases, as you can see in the example I've included.

In a scenario like the following, I routinely get either a KeyError (the metadata key doesn't exist) or an AssertionError (the value isn't the expected one).

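    # array_uri, a, and ctx are defined elsewhere in the full example
    JSON_VALUE = '{"a":"1"}'
    # Write the metadata at a fixed timestamp
    with tiledb.open(array_uri, "w", timestamp=(a, a), ctx=ctx) as A: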
        A.meta["aaa"] = JSON_VALUE

    # Open array for reading
    with tiledb.open(array_uri, timestamp=(a, a), ctx=ctx) as A2:
        try:
            assert A2.meta["aaa"] == JSON_VALUE
        except KeyError:
            print(f'Failed to insert first value at timestamp=({a}, {a})')
        except AssertionError:
            print(f'Failed to overwrite data at timestamp=({a}, {a})')

I don't necessarily think there's anything wrong, because it's a race condition with a database and I think it's fair to expect users to handle concurrency.

I am wondering if there is a prescribed way to deal with this though. Is it possible to wait for a result to be written to the metadata? Or do I just need to be reading and validating that it's been written before moving on? I don't think I'm seeing anything in the docs for this.
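To make the "read and validate before moving on" option concrete, here is a minimal sketch of a polling helper. It is only an illustration under assumptions: `wait_for_value` and `read_fn` are hypothetical names, not part of TileDB-Py, and nothing here implies that retrying is the recommended approach.

```python
import time


def wait_for_value(read_fn, expected, timeout=5.0, interval=0.1):
    """Poll read_fn() until it returns `expected` or `timeout` seconds pass.

    read_fn is a hypothetical zero-argument callable supplied by the caller.
    It may raise KeyError while the key is not yet visible.
    Returns True on success, False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if read_fn() == expected:
                return True
        except KeyError:
            pass  # key not visible yet; fall through and retry
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In the scenario above, `read_fn` could be a small function that opens the array with `tiledb.open(array_uri, timestamp=(a, a), ctx=ctx)` and returns `A.meta["aaa"]`, with `JSON_VALUE` passed as `expected`.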

I appreciate any help/direction, thank you!

Example code: https://gist.github.com/kylemann16/35bc2cf3ddc6929e9373ec3c665a58fb

versions:

    # Name                    Version                   Build  Channel
    libgdal-tiledb            3.10.3              hb4b1b75_10    conda-forge
    tiledb                    2.28.0               h8c23ae0_0    conda-forge
    tiledb-py                 0.34.0          py313h3af326a_1    conda-forge

kylemann16, Jun 06 '25 15:06