webknossos-libs
Small writes to compressed sharded zarr3 mags are slow
Writing a small data shape to a compressed Zarr3 array seems to be significantly slower than for WKW (roughly 5x in the run below):
Writing shape (24, 24, 24) to zarr3 mag...
took 2.5s
Writing shape (24, 24, 24) to wkw mag...
took 0.5s
with this test script:
import os
import shutil
import time

import numpy as np
import webknossos as wk


def main():
    # Switch to wk.DataFormat.WKW to get the WKW timing for comparison.
    data_format = wk.DataFormat.Zarr3
    ds_path = "output-test"

    # Start from a clean dataset directory on every run.
    if os.path.exists(ds_path):
        shutil.rmtree(ds_path)

    # Small random uint8 block, much smaller than a shard.
    data = (np.random.rand(24, 24, 24) * 255).astype(np.uint8)

    dataset = wk.Dataset(ds_path, voxel_size=(1, 1, 1))
    test_layer = dataset.add_layer(
        layer_name="color",
        category="color",
        data_format=data_format,
    )
    test_mag = test_layer.add_mag("1", compress=True)

    # Time only the write call itself.
    print(f"Writing shape {data.shape} to {data_format} mag...")
    before = time.time()
    test_mag.write(
        absolute_offset=(0, 0, 0),
        data=data,
    )
    after = time.time()
    print(f"took {after - before:.1f}s")


if __name__ == "__main__":
    main()
I noticed this during downsampling, which does exactly this kind of small write and compresses by default. A smaller chunks_per_shard helps, as does compress=False for the output mag.
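To get a feel for why chunks_per_shard might matter here, this is a rough back-of-the-envelope sketch, not a profile. It assumes a chunk shape of 32 and a chunks_per_shard of 32 per axis (I believe these are the wk-libs defaults, but treat both as assumptions), and it relies on the Zarr v3 sharding layout, where the shard index stores two uint64 values (offset and length) per inner chunk. shard_stats is a hypothetical helper written for this illustration; it is not part of webknossos or zarrita. It does not show where the time actually goes, only how much per-shard bookkeeping surrounds one small write.

```python
from math import prod

# Zarr v3 sharding: one index entry per inner chunk, made of two uint64
# values (byte offset and byte length), i.e. 16 bytes per chunk.
INDEX_ENTRY_BYTES = 16


def shard_stats(chunk_shape, chunks_per_shard, write_shape):
    """Bookkeeping numbers for one write that starts at offset (0, 0, 0).

    Hypothetical helper for illustration only; all shapes are per-axis tuples.
    """
    shard_shape = tuple(c * n for c, n in zip(chunk_shape, chunks_per_shard))
    chunks_in_shard = prod(chunks_per_shard)
    # Chunks touched along each axis: ceil(write extent / chunk extent).
    chunks_touched = prod(-(-w // c) for w, c in zip(write_shape, chunk_shape))
    return {
        "shard_shape": shard_shape,
        "chunks_in_shard": chunks_in_shard,
        "chunks_touched": chunks_touched,
        "index_bytes": chunks_in_shard * INDEX_ENTRY_BYTES,
    }


if __name__ == "__main__":
    # Assumed defaults (32 per axis) versus a smaller chunks_per_shard.
    for cps in (32, 8):
        stats = shard_stats((32, 32, 32), (cps, cps, cps), (24, 24, 24))
        print(f"chunks_per_shard={cps}: {stats}")
```

Under those assumed values, the 24x24x24 write touches a single chunk out of 32768 in the shard, and the shard index alone is 512 KiB; with chunks_per_shard=8 the shard holds 512 chunks and the index shrinks to 8 KiB.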
I'm not certain whether this should be a zarrita issue rather than a wk-libs one.
Note that before the optimization introduced in https://github.com/scalableminds/webknossos-libs/pull/963, the same code took 15s.