dask-image

jp2 slicing

Open YangForever opened this issue 1 year ago • 1 comments

Describe the issue: Hi, I am trying to use dask_image.imread.imread() to speed up sub-volume extraction from my dataset of .jp2 images, but it throws an error when I extract multiple slices.

Pseudocode example:

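import dask_image.imread

def read_subvol_stack(path, file_type):
    # Lazily stack every matching file along axis 0 (one chunk per image)
    img_array_dask = dask_image.imread.imread(f"{path}/*.{file_type}")
    print(img_array_dask)
    # Slicing several frames and computing them triggers the error for .jp2 input
    img_array = img_array_dask[0:900, :, :].compute()
    return img_array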

The print() function gives:

dask.array<_map_read_frame, shape=(3770, 1898, 1898), dtype=uint16, chunksize=(1, 1898, 1898), chunktype=numpy.ndarray>

When compute() runs, an error occurs:

ValueError: could not broadcast input array from shape (1,1898,1898) into shape (1,1,1898)

I have checked: when compute() is called, imread() reads each image with shape (1, 1, 1898, 1898), so it cannot be broadcast to the chunk size of (1, 1898, 1898).
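To make the mismatch concrete, here is a minimal NumPy-only sketch. The helper name `squeeze_to_chunk` is hypothetical and not part of dask-image; it only illustrates that a frame read as (1, 1, H, W) holds the same data as the (1, H, W) chunk dask expects, and differs only by a singleton axis.

```python
import numpy as np

def squeeze_to_chunk(frame, chunk_shape):
    """Reshape `frame` to `chunk_shape` if the two differ only by singleton axes."""
    frame = np.asarray(frame)
    # Compare the shapes with every length-1 axis removed
    core = tuple(n for n in frame.shape if n != 1)
    target = tuple(n for n in chunk_shape if n != 1)
    if core != target:
        raise ValueError(
            f"frame shape {frame.shape} is incompatible with chunk shape {chunk_shape}"
        )
    return frame.reshape(chunk_shape)

# Shapes taken from the report: the reader yields one extra leading axis
frame = np.zeros((1, 1, 1898, 1898), dtype=np.uint16)
chunk = squeeze_to_chunk(frame, (1, 1898, 1898))
print(chunk.shape)  # (1, 1898, 1898)
```

This does not patch dask_image.imread itself; it only shows the shape normalisation that a custom reader would need to apply.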

The code works well for .tif or .png images.

Anything else we need to know?: I have reproduced the error in this repository, in case it helps: https://github.com/YangForever/DaskImageSlicing/tree/main

Environment:

  • Dask version: 2023.11.0
  • Python version: 3.9
  • Operating System: Ubuntu 20.04
  • Install method (conda, pip, source): conda

YangForever avatar Mar 26 '24 13:03 YangForever

@YangForever thanks for reporting this issue 🙏

Actually, there are several known problems with the current dask_image.imread implementation (see https://github.com/dask/dask-image/issues/229) and we recommend considering one of the readers mentioned here.
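As a rough sketch of what an alternative reader could look like (an assumption, not the approach recommended in the linked issue), the stack can be built by hand with dask.delayed and dask.array.from_delayed. The names `lazy_stack` and `read_frame` are hypothetical: `read_frame` stands for any user-supplied function that loads one file into an array, and the frame shape and dtype must be known in advance.

```python
import dask
import dask.array as da
import numpy as np

def lazy_stack(paths, read_frame, frame_shape, dtype):
    """Build a lazy (n_files, H, W) array with one chunk per file."""
    def load(path):
        # Reshape to `frame_shape`, which drops any extra singleton axes
        return np.asarray(read_frame(path)).reshape(frame_shape)

    frames = [
        da.from_delayed(dask.delayed(load)(p), shape=frame_shape, dtype=dtype)
        for p in paths
    ]
    return da.stack(frames, axis=0)

# Synthetic stand-in for a real .jp2 reader: returns a frame with two
# extra leading axes, similar to the shape described in the report
def fake_reader(path):
    return np.full((1, 1, 4, 5), int(path), dtype=np.uint16)

stack = lazy_stack(["0", "1", "2"], fake_reader, (4, 5), np.uint16)
sub = stack[0:2].compute()  # only the first two "files" are read
print(sub.shape)  # (2, 4, 5)
```

In real use, `fake_reader` would be replaced by a function wrapping whichever JPEG 2000 capable library is installed, and `paths` by a sorted list of the .jp2 files.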

m-albert avatar May 08 '24 16:05 m-albert