webknossos-libs

Small writes to compressed sharded zarr3 mags are slow

Open fm3 opened this issue 2 years ago • 0 comments

Writing a small array to a compressed Zarr3 mag seems to be significantly slower than writing the same array to a WKW mag:

Writing shape (24, 24, 24) to zarr3 mag...
took 2.5s

Writing shape (24, 24, 24) to wkw mag...
took 0.5s

with this test script:

import shutil
import webknossos as wk
import numpy as np
import time
import os

def main():
    data_format = wk.DataFormat.Zarr3

    ds_path = "output-test"
    if os.path.exists(ds_path):
        shutil.rmtree(ds_path)

    data = (np.random.rand(24, 24, 24) * 255).astype(np.uint8)
    dataset = wk.Dataset(ds_path, voxel_size=(1,1,1))

    test_layer = dataset.add_layer(
        layer_name="color",
        category="color",
        data_format=data_format,
    )
    test_mag = test_layer.add_mag("1", compress=True)

    print(f"Writing shape {data.shape} to {data_format} mag...")
    # perf_counter() is monotonic and better suited to measuring short intervals
    before = time.perf_counter()
    test_mag.write(
        absolute_offset=(0, 0, 0),
        data=data,
    )

    after = time.perf_counter()
    print(f"took {after - before:.1f}s")


if __name__ == '__main__':
    main()
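
To compare several configurations with less noise than a single measurement, a small timing helper along these lines might be useful. This is only a sketch using the standard library; `time_call` and its `repeats` parameter are hypothetical names and not part of webknossos.

```python
import time


def time_call(fn, repeats=3):
    """Call fn() `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Stand-in workload; in the script above this would wrap the test_mag.write(...) call.
    elapsed = time_call(lambda: sum(range(100_000)))
    print(f"fastest run took {elapsed:.4f}s")
```

When wrapping the write call, keep in mind that every repetition after the first overwrites existing data, which may not cost the same as the initial write.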

I noticed this during downsampling, which performs exactly this kind of write and compresses by default. A smaller chunks_per_shard helps, as does compress=False for the output mag.
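
The cause is not established here. One possibility, consistent with a smaller chunks_per_shard helping, is that the cost of a small write grows with the size of the enclosing shard rather than with the amount of data written. The sketch below only does the arithmetic for that hypothesis: it counts how many chunks a write overlaps and how many chunks one shard holds. The chunk shape of (32, 32, 32) and the chunks_per_shard values are assumptions chosen for illustration, not values taken from this report, and the helper names are hypothetical.

```python
import math


def chunks_touched(offset, shape, chunk_shape):
    """Number of chunks overlapped by a write of extent `shape` starting at `offset`."""
    count = 1
    for o, s, c in zip(offset, shape, chunk_shape):
        first = o // c            # index of the first chunk touched along this axis
        last = (o + s - 1) // c   # index of the last chunk touched along this axis
        count *= last - first + 1
    return count


def chunks_in_shard(chunks_per_shard):
    """Total number of chunks stored in one shard."""
    return math.prod(chunks_per_shard)


if __name__ == "__main__":
    write_offset = (0, 0, 0)
    write_shape = (24, 24, 24)
    chunk_shape = (32, 32, 32)  # assumed for illustration

    for cps in [(32, 32, 32), (8, 8, 8), (1, 1, 1)]:  # assumed for illustration
        touched = chunks_touched(write_offset, write_shape, chunk_shape)
        total = chunks_in_shard(cps)
        print(f"chunks_per_shard={cps}: write touches {touched} chunk(s), shard holds {total}")
```

If the shard has to be read, recompressed, or rewritten as a whole, the ratio between the two numbers would be a rough upper bound on the overhead. Whether that actually happens in zarrita or wk-libs would need profiling.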

I'm not certain whether this should be a zarrita issue rather than a wk-libs one.

Note that before the optimization introduced in https://github.com/scalableminds/webknossos-libs/pull/963, the same code took 15s.

fm3, Nov 07 '23 14:11