Francesc Alted

320 comments by Francesc Alted

Hi @rabernat, thanks for your interest in Caterva. No, there were no plans to introduce groups. Having said that, we discussed that internally and we do think that this would...

Sounds good. I've added a couple of notes about your experience with the optimal number of threads.

> at the heart of Zarr is the [blosc1 meta-compressor](https://www.blosc.org/). however, blosc2 has been out for a number of years and shows [significant speed benefit](https://www.blosc.org/posts/caterva-slicing-perf/) over its predecessor, with more...
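
For context, this is roughly how a Zarr array (v2-style API) picks up the Blosc(1) codec through numcodecs today; the shapes, chunking, and codec parameters below are illustrative, not taken from the thread:

```python
import numpy as np
import zarr
from numcodecs import Blosc

# numcodecs wraps C-Blosc1, and Zarr applies the codec to each chunk.
# cname/clevel/shuffle are just example choices.
compressor = Blosc(cname="lz4", clevel=5, shuffle=Blosc.SHUFFLE)
z = zarr.zeros((4_000, 4_000), chunks=(1_000, 1_000), dtype="f8",
               compressor=compressor)
z[:] = np.random.random_sample(z.shape)
print(z.info)  # reports the Blosc codec and the achieved compression ratio
```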

> What container format is using blosc2? The goal is wide support, not the fastest compression algorithm.

[Upstream developer speaking here] The format for Blosc2 is documented in a series...

> > Just a note here. blosc2 is actually _backward_ compatible with blosc1 (i.e. blosc2 tools can read blosc1 data without problems), but it is not _forward_ compatible (i.e. blosc2...
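
A minimal way to see this compatibility direction in practice, assuming both the `blosc` (Blosc1) and `blosc2` Python wheels are installed; this is a sketch, not an official test:

```python
import numpy as np
import blosc    # Python wrapper for C-Blosc1
import blosc2   # Python wrapper for C-Blosc2

data = np.arange(1_000_000, dtype=np.int64).tobytes()

# Backward compatibility: C-Blosc2 can read a chunk produced by Blosc1.
c1 = blosc.compress(data, typesize=8, cname="lz4")
assert blosc2.decompress(c1) == data

# Forward compatibility does not hold: a Blosc2-format chunk is expected
# to be rejected by the Blosc1 decompressor.
c2 = blosc2.compress(data, typesize=8)
try:
    blosc.decompress(c2)
except Exception as exc:
    print("Blosc1 cannot read the Blosc2 chunk:", exc)
```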

Numcodecs adopting Blosc2 would be great. BTW, what we recently released as 2.0 is Python-Blosc2, not C-Blosc2 (whose 2.0 release happened [1.5 years ago](https://github.com/Blosc/c-blosc2/releases/tag/v2.0.0)). For what it's worth, we have...
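
A hypothetical sketch of what a numcodecs-style wrapper around Python-Blosc2 could look like; the class name, `codec_id`, and parameters are made up for illustration, and this is not an existing numcodecs codec:

```python
import blosc2
from numcodecs.abc import Codec
from numcodecs.compat import ensure_contiguous_ndarray, ndarray_copy


class Blosc2Sketch(Codec):
    """Illustrative numcodecs-style codec backed by Python-Blosc2."""

    codec_id = "blosc2-sketch"  # made-up id, not an official registration

    def __init__(self, clevel=5, typesize=8):
        self.clevel = clevel
        self.typesize = typesize

    def encode(self, buf):
        # Python-Blosc2 keeps a Blosc1-like one-shot compress API.
        arr = ensure_contiguous_ndarray(buf)
        return blosc2.compress(arr, typesize=self.typesize, clevel=self.clevel)

    def decode(self, buf, out=None):
        decompressed = blosc2.decompress(bytes(buf))
        if out is not None:
            return ndarray_copy(decompressed, out)
        return decompressed
```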

> @FrancescAlted, do you have a timeline for how much longer c-blosc1 will be maintained?

Well, the Blosc team will be supporting C-Blosc1 for as long as possible. Having said...

Well spotted, @t20100! Indeed, 5250 elements is very small for Blosc2, which requires much larger chunk sizes to scale well. FWIW, to eliminate any doubt about the hdf5plugin implementation, here...
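
To make the chunk-size point concrete, here is a hedged sketch of writing an HDF5 dataset through h5py with hdf5plugin's Blosc2 filter and deliberately large chunks; the file name, shapes, and codec parameters are illustrative:

```python
import h5py
import hdf5plugin
import numpy as np

data = np.random.random_sample((4_000, 4_000))  # ~128 MB of float64

with h5py.File("blosc2-chunks.h5", "w") as f:
    # Large chunks (~8 MB each here) give Blosc2 room to split the data
    # into blocks and use its threads; a 5250-element chunk is far too
    # small for that to pay off.
    f.create_dataset(
        "data",
        data=data,
        chunks=(1_000, 1_000),
        **hdf5plugin.Blosc2(cname="lz4", clevel=5),
    )
```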

FWIW, if one still wants small chunks, it is better to use a single thread with Blosc/Blosc2. With the original array (8 MB):

```
$ BLOSC_NTHREADS=1 python compare-blosc-blosc2.py
time blosc ...
```
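
The same single-thread setting can also be applied from inside a script instead of the environment; a short sketch (the chunk size is arbitrary, chosen to mimic a small-chunk scenario):

```python
import numpy as np
import blosc2

# Equivalent to running with BLOSC_NTHREADS=1: for small chunks the
# threading overhead tends to dominate, so a single thread is usually faster.
blosc2.set_nthreads(1)

chunk = np.arange(5_250, dtype=np.float64).tobytes()  # a deliberately small chunk
compressed = blosc2.compress(chunk, typesize=8)
print(len(chunk), "->", len(compressed), "bytes")
```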

Yes, performance for such 'small' datasets tends to be quite dependent on the CPU. On my MacBook Air (M1 processor):

```
$ BLOSC_NTHREADS=1 python prova.py
time blosc (h5py, chunks=None): ...
```
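
The timing scripts themselves are not shown here, but a comparison along those lines could look like the following sketch (array size and parameters are illustrative, not the original compare-blosc-blosc2.py or prova.py):

```python
import time

import numpy as np
import blosc    # C-Blosc1 bindings
import blosc2   # C-Blosc2 bindings

data = np.random.random_sample(1_000_000).tobytes()  # ~8 MB, like the original array

for name, compress in (("blosc", lambda b: blosc.compress(b, typesize=8)),
                       ("blosc2", lambda b: blosc2.compress(b, typesize=8))):
    t0 = time.perf_counter()
    compressed = compress(data)
    elapsed = time.perf_counter() - t0
    print(f"time {name}: {elapsed:.4f} s, "
          f"ratio {len(data) / len(compressed):.2f}x")
```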