tensorstore

Documentation: Performance Comparison

Open jhnnsrs opened this issue 2 years ago • 4 comments

Cool job unifying these Python projects. I was wondering whether there is a way to document the performance impact of using this library compared to, e.g., a vanilla Python zarr implementation. In a perfect world this would cover a few scenarios, including heavily IO-bound ones such as HTTP stores/drivers.

jhnnsrs avatar Oct 03 '22 08:10 jhnnsrs

+1 for this. Also would be nice to have something about pros/cons of e.g. zarr vs n5 drivers...

harpone avatar Oct 27 '22 09:10 harpone

+1 - I am curious what performance impact one could expect

Axel-Jacobsen avatar Apr 08 '23 20:04 Axel-Jacobsen

zarr vs n5 performance will be very similar. In general I would recommend zarr over n5 since zarr is becoming more standard and offers more functionality.

Compared to zarr-python, tensorstore is better able to take advantage of multiple CPU cores and multiple concurrent IO operations, without relying on additional layers like dask for parallelism. With zarr-python it is usually necessary to use very large chunk sizes to get reasonable performance; that is not generally the case with tensorstore.

Of course actual benchmarks would be better than these anecdotes, but we haven't run anything systematic yet.
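
As a rough illustration of why concurrent IO matters in IO-bound cases such as HTTP stores, here is a toy simulation using only the Python standard library. It is not a benchmark of tensorstore or zarr-python and calls neither library: `fetch_chunk`, `LATENCY_S`, and `NUM_CHUNKS` are hypothetical stand-ins, and the network latency is faked with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY_S = 0.02   # hypothetical per-request latency of a remote store
NUM_CHUNKS = 16    # hypothetical number of chunks covering the requested region


def fetch_chunk(index):
    """Stand-in for one IO-bound chunk read (e.g. an HTTP GET)."""
    time.sleep(LATENCY_S)  # simulates waiting on the network
    return index * index   # dummy payload


def read_sequential():
    # One request at a time: total time is roughly NUM_CHUNKS * LATENCY_S.
    return [fetch_chunk(i) for i in range(NUM_CHUNKS)]


def read_concurrent(max_workers=NUM_CHUNKS):
    # All requests in flight at once: total time is roughly one LATENCY_S.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_chunk, range(NUM_CHUNKS)))


def timed(fn):
    """Return (result, elapsed seconds) for a zero-argument callable."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


seq_result, seq_time = timed(read_sequential)
con_result, con_time = timed(read_concurrent)
print(f"sequential: {seq_time:.3f}s, concurrent: {con_time:.3f}s")
```

With many small chunks, the sequential version pays the latency once per chunk, which is one reason large chunks help when reads are not issued concurrently. Real numbers depend on the store, the chunk layout, and decompression cost, so this only shows the shape of the effect.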

jbms avatar Apr 09 '23 03:04 jbms

This is helpful, thank you!

JackKelly avatar Apr 27 '23 14:04 JackKelly