tensorstore
Transactional/ACID semantics
I have a general question in regard to:
- Natively supports multiple storage drivers, including Google Cloud Storage, local and network filesystems, and in-memory storage.
- Support for read/writeback caching and transactions, with strong atomicity, isolation, consistency, and durability (ACID) guarantees.
and this sentence in the Blog:
Safety of parallel operations when many machines are accessing the same dataset is achieved through the use of optimistic concurrency, which maintains compatibility with diverse underlying storage layers (including Cloud storage platforms, such as GCS, as well as local filesystems) without significantly impacting performance. TensorStore also provides strong ACID guarantees for all individual operations executing within a single runtime.
I created a dummy dataset with the zarr + S3 drivers:
2024-04-04 15:33:22 230 ts/yang-test-dataset/.zarray
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.0
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.1
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.2
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.3
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.4
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.5
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.6
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.7
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.8
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.9
and then created a situation where the next write to chunk 0.0.3 would fail. Running under a transaction
with ts.Transaction() as txn:
    ds.with_transaction(txn)[80:82, 99:102, :] = [[[1], [2], [3]], [[4], [5], [6]]]
would throw
Traceback (most recent call last):
File "/home/yang.yang/workspaces/tensorstore/.yang/foo.py", line 33, in <module>
with ts.Transaction() as txn:
ValueError: PERMISSION_DENIED: Error writing "ts/yang-test-dataset/0.0.3": HTTP response code: 403 with body: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>SK6BWG5ESTC2NVJ6</RequestId><HostId>5z3/QZmVne5TyFJUH0A0swSAtyyhsl47I/z7AjULiGmsj1QAtf3JEA6d/TAuWH/ts1xCHJmVucM=</HostId></Error> [source locations='tensorstore/kvstore/s3/s3_key_value_store.cc:777\ntensorstore/kvstore/kvstore.cc:373'
but the S3 bucket after this operation looks like this:
2024-04-04 15:33:22 230 ts/yang-test-dataset/.zarray
2024-04-04 17:14:57 48573 ts/yang-test-dataset/0.0.0
2024-04-04 17:14:58 48573 ts/yang-test-dataset/0.0.1
2024-04-04 17:14:58 48573 ts/yang-test-dataset/0.0.2
2024-04-04 16:43:54 48573 ts/yang-test-dataset/0.0.3 <--- not updated
2024-04-04 17:14:57 48573 ts/yang-test-dataset/0.0.4
2024-04-04 17:14:58 48573 ts/yang-test-dataset/0.0.5
2024-04-04 17:14:57 48573 ts/yang-test-dataset/0.0.6
2024-04-04 17:14:58 48573 ts/yang-test-dataset/0.0.7
2024-04-04 17:14:57 48573 ts/yang-test-dataset/0.0.8
2024-04-04 17:14:57 48573 ts/yang-test-dataset/0.0.9
So from the perspective of an observer (who may eventually want to load this dataset again), the operation does not appear to be transactional. When the blog says "transactional within a single runtime", do you mean that the process's own view of ds when the context manager exits is transactional, but that no guarantees are made about the state of the underlying storage?
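For what it's worth, the partial bucket state above is consistent with a writeback model in which the transaction stages chunk writes locally and then flushes them as independent per-key PUTs on commit. Here is a minimal stdlib sketch of that model, not TensorStore's actual code: `FlakyStore` and the flush loop are invented, and I flush sequentially where the real flushes presumably run concurrently (which would explain why chunks after 0.0.3 were also updated):

```python
class FlakyStore:
    """Toy key-value store where writes to one key are forbidden,
    standing in for the 403 on chunk 0.0.3."""

    def __init__(self, denied_key):
        self.data = {}
        self.denied_key = denied_key

    def put(self, key, value):
        if key == self.denied_key:
            raise PermissionError(f"AccessDenied: {key}")
        self.data[key] = value


def commit(store, staged):
    # Flush staged writes one key at a time; there is no way to make
    # several independent PUTs take effect atomically.
    for key, value in staged.items():
        store.put(key, value)


store = FlakyStore(denied_key="0.0.3")
staged = {f"0.0.{i}": b"new" for i in range(10)}  # chunks touched by the write

try:
    commit(store, staged)
except PermissionError as exc:
    print("commit failed:", exc)

# Keys flushed before the failure were updated; the rest were not,
# so an outside observer sees a partially applied write.
print(sorted(store.data))
```

Under this model, the exception propagating out of the `with` block tells the writer the commit failed, but nothing undoes the per-key PUTs that already succeeded.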
If one sets
with ts.Transaction(atomic=True) as txn:
...
then if a write would span multiple chunks, I see an error
ValueError: Cannot read/write "ts/yang-test-dataset/.zarray" and read/write "ts/yang-test-dataset/0.0.0" as single atomic transaction [source locations='tensorstore/internal/cache/kvs_backed_cache.h:221\ntensorstore/internal/cache/async_cache.cc:660\ntensorstore/internal/cache/async_cache.h:383\ntensorstore/internal/cache/chunk_cache.cc:438\ntensorstore/internal/grid_partition.cc:246\ntensorstore/internal/grid_partition.cc:246\ntensorstore/internal/grid_partition.cc:246']
I'm guessing this is expected, since there is no way to perform a transactional write across multiple S3 objects?
Lastly, on the topic of "optimistic concurrency and compatibility with GCS/other storage layers": since, AFAIK, S3 does not support conditional PUTs the way that GCS does, is there a possibility of data loss when using S3?
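To make the concern concrete: optimistic concurrency is usually built on a conditional write (compare-and-swap against an object generation), and without it a read-modify-write race silently drops an update. A stdlib sketch of the two modes, using a toy store rather than the real S3 or GCS APIs (the `write_if` precondition mimics GCS-style generation matching; names are invented):

```python
import itertools

class Store:
    """Toy object store tracking a generation number per key."""

    def __init__(self):
        self.data = {}
        self.gen = itertools.count(1)

    def read(self, key):
        return self.data.get(key, (None, 0))  # (value, generation)

    def write(self, key, value):
        # Unconditional PUT, as on S3: last writer wins.
        self.data[key] = (value, next(self.gen))

    def write_if(self, key, value, expected_gen):
        # Conditional PUT, like a GCS generation-match precondition:
        # fails instead of clobbering a concurrent update.
        _, current = self.read(key)
        if current != expected_gen:
            return False
        self.data[key] = (value, next(self.gen))
        return True


# Two writers both read the same chunk, then each adds its own item.
store = Store()
store.write("chunk", {"a": 1})
val1, gen1 = store.read("chunk")
val2, gen2 = store.read("chunk")

# Unconditional writes: writer 2 silently erases writer 1's change.
store.write("chunk", {**val1, "b": 2})
store.write("chunk", {**val2, "c": 3})
print(store.read("chunk")[0])  # {'a': 1, 'c': 3} -- 'b' was lost

# Conditional writes: the stale writer fails and must re-read and retry.
store = Store()
store.write("chunk", {"a": 1})
val1, gen1 = store.read("chunk")
val2, gen2 = store.read("chunk")
assert store.write_if("chunk", {**val1, "b": 2}, gen1)
assert not store.write_if("chunk", {**val2, "c": 3}, gen2)  # stale
```

The failed conditional write is what drives the retry loop in optimistic concurrency; with only unconditional PUTs there is no signal that the clobbering happened.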
Thanks in advance!
The S3 support was added recently, but we do indeed need to clarify its limitations in the documentation.
S3 lacks conditional write support, so with multiple concurrent writes to the same object it is indeed possible for some writes to be lost.
There is a strategy for implementing atomic writes on S3 under certain assumptions about timestamps, but it would require a list operation in order to read, which may be costly. When using this strategy with OCDBT, only a single list operation would be needed for the manifest; subsequent reads (using the cached manifest) would be normal read operations, and multi-key atomic transactions could also be supported. (A small amount of work remains to actually support both S3 and multi-key atomic operations with OCDBT.)
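A rough illustration of the kind of list-based scheme this refers to, purely to show why reads need a list operation; the key layout, timestamp encoding, and tie-breaking here are invented for the sketch and are not TensorStore's design:

```python
import time
import uuid

store = {}  # stands in for a bucket: key -> bytes

def write_version(logical_key, value):
    # Each write goes to a fresh, unique key, so concurrent writers
    # never overwrite each other; the timestamp orders the versions.
    version_key = f"{logical_key}/{time.time_ns()}-{uuid.uuid4().hex}"
    store[version_key] = value
    return version_key

def read_latest(logical_key):
    # Reading requires listing all versions under the key (the costly
    # part noted above) and picking the newest by timestamp.
    prefix = logical_key + "/"
    versions = [k for k in store if k.startswith(prefix)]
    if not versions:
        return None
    return store[max(versions)]

write_version("chunk/0.0.3", b"old")
time.sleep(0.001)  # ensure distinct timestamps in this toy demo
write_version("chunk/0.0.3", b"new")
print(read_latest("chunk/0.0.3"))  # b'new'
```

With OCDBT, as noted above, only the manifest would need this list-on-read treatment; chunk data reachable from a cached manifest can be fetched with plain GETs.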