tensorstore
Documentation: Performance Comparison
Cool job unifying these Python projects. I was wondering whether there is a way to document the performance impact of using this library compared with, e.g., a vanilla Python zarr implementation. In a perfect world this would cover a few scenarios, such as heavily IO-bound ones like HTTP stores/drivers.
+1 for this. Also would be nice to have something about pros/cons of e.g. zarr vs n5 drivers...
+1 - I am curious what performance impact one could expect
zarr vs n5 performance will be very similar. In general I would recommend zarr over n5, since zarr is becoming more standard and offers more functionality.
Compared to zarr-python, tensorstore is better able to take advantage of multiple CPU cores and multiple concurrent IO operations, without relying on additional layers like dask for parallelism. With zarr-python it is usually necessary to use very large chunk sizes to get reasonable performance; that is not generally the case with tensorstore.
Of course actual benchmarks would be better than these anecdotes, but we haven't run anything systematic yet.
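To illustrate why concurrent IO matters for small chunks on a high-latency store, here is a toy sketch using only the Python standard library. It does not measure tensorstore or zarr-python: `fetch_chunk`, the 10 ms latency, and the chunk and worker counts are all hypothetical stand-ins chosen for illustration, so the timings say nothing about either library.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumed values, purely illustrative: not measured from any real store.
CHUNK_LATENCY_S = 0.01
NUM_CHUNKS = 16


def fetch_chunk(index):
    """Hypothetical stand-in for reading one chunk from a high-latency store."""
    time.sleep(CHUNK_LATENCY_S)  # simulate the round trip of one request
    return index


def read_sequential(num_chunks):
    """Issue one request at a time, so latencies add up."""
    return [fetch_chunk(i) for i in range(num_chunks)]


def read_concurrent(num_chunks, max_workers=8):
    """Keep several requests in flight at once, so latencies overlap."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_chunk, range(num_chunks)))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


seq_result, seq_seconds = timed(read_sequential, NUM_CHUNKS)
con_result, con_seconds = timed(read_concurrent, NUM_CHUNKS)
print(f"sequential: {seq_seconds:.3f}s, concurrent: {con_seconds:.3f}s")
```

In this model the sequential read pays the per-request latency once per chunk, while the concurrent read overlaps the waits. A real comparison would replace `fetch_chunk` with actual reads through each library against the same store and chunk layout.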
This is helpful, thank you!