Are plugin codecs available?

Open Artholgar opened this issue 2 years ago • 3 comments

Hi,

I was trying to use ZFP with python-blosc2, with a simple example as follows:

    a = np.arange(1e6, dtype="int32")

    b = blosc2.pack_array2(a, cparams={"codec": blosc2.Codec.ZFP_RATE, "clevel": 5, "typesize": 4})

But I got this error:

    Traceback (most recent call last):
      File "D:\COPEX_DCC\ZARR\temp.py", line 44, in <module>
        b = blosc2.pack_array2(a,
      File "D:\COPEX_DCC\ZARR\venv\lib\site-packages\blosc2\core.py", line 444, in pack_array2
        return pack_tensor(arr, chunksize, **kwargs)
      File "D:\COPEX_DCC\ZARR\venv\lib\site-packages\blosc2\core.py", line 615, in pack_tensor
        schunk = blosc2.SChunk(chunksize=chunksize, data=arr, **kwargs)
      File "D:\COPEX_DCC\ZARR\venv\lib\site-packages\blosc2\schunk.py", line 234, in __init__
        super(SChunk, self).__init__(_schunk=sc, chunksize=chunksize, data=data, **kwargs)
      File "blosc2_ext.pyx", line 890, in blosc2.blosc2_ext.SChunk.__init__
    RuntimeError: An error occurred while appending the chunks

The same code works well with a different codec, so is the ZFP codec used differently, or is it just not available yet?

Thanks for your help!

Artholgar avatar May 23 '23 10:05 Artholgar

Yep, I can reproduce this. It looks like a strange interaction between pack_array2 and the plugins because using ZFP with blosc2.compress has no issues.

FrancescAlted avatar May 23 '23 11:05 FrancescAlted

OK, thanks, now it works! But now I'm trying to tune my codec like this:

    a = np.arange(1e6, dtype="int32")

    b = blosc2.compress2(a, cparams={"codec": blosc2.Codec.ZFP_PREC, "codec_meta": 2, "clevel": 1, "typesize": 4})

But the codec_meta field doesn't seem to make any difference. I saw in C-Blosc that there is a meta field to tune the codec, so I thought that codec_meta was the equivalent for Python.

Artholgar avatar May 23 '23 15:05 Artholgar

compress2 does not follow the more modern convention of using cparams. It is more convenient to use the NDArray container for that. Example:

    import numpy as np
    import blosc2

    shape = (50, 50)
    chunks = (49, 49)

    # Create a NDArray from a NumPy array
    array = np.random.normal(0, 1, np.prod(shape)).reshape(shape)
    # Use ZFP_PREC codec
    cparams = {"codec": blosc2.Codec.ZFP_PREC, "codec_meta": 2}
    a = blosc2.asarray(array, chunks=chunks, cparams=cparams)
    print(f"cratio for meta {cparams['codec_meta']}: {a.schunk.cratio:.2f}x")

    cparams = {"codec": blosc2.Codec.ZFP_PREC, "codec_meta": 9}
    a = blosc2.asarray(array, chunks=chunks, cparams=cparams)
    print(f"cratio for meta {cparams['codec_meta']}: {a.schunk.cratio:.2f}x")

which has this output:

    cratio for meta 2: 4.18x
    cratio for meta 9: 2.53x

See more examples of the NDArray object at: https://github.com/Blosc/python-blosc2/tree/main/examples/ndarray

FrancescAlted avatar May 23 '23 17:05 FrancescAlted

Closing due to inactivity.

FrancescAlted avatar Jun 07 '24 16:06 FrancescAlted