Performance regression for lz4 decompression after version 4.0.1

Open Dalbasar opened this issue 3 months ago • 1 comments

I noticed that LZ4 decompression via hdf5plugin 4.1.0 and later is 5-6x slower than with hdf5plugin 4.0.1, while the compression speed is very similar:

import time
import hdf5plugin
import h5py
import numpy as np
from io import BytesIO

test_data = np.ones((1024, 1024, 1024), np.uint8)

raw_buffer = BytesIO()

with h5py.File(raw_buffer, 'w') as f:
    compression_start_time = time.perf_counter()
    f.create_dataset('data', data=test_data, compression=hdf5plugin.LZ4())
    compression_time = time.perf_counter() - compression_start_time

with h5py.File(raw_buffer, 'r') as f:
    decompression_start_time = time.perf_counter()
    data = f['data'][:]
    decompression_time = time.perf_counter() - decompression_start_time

print(f"hdf5plugin {hdf5plugin.version}: "
      f"lz4 compression time {compression_time:.3f}s, "
      f"lz4 decompression_time: {decompression_time:.3f}s")

This gives the following results for different hdf5plugin versions with h5py 3.12.1 on Python 3.11.9 on Windows 10 (AMD Ryzen 7 5900X):

hdf5plugin 4.0.1: lz4 compression time 0.219s, lz4 decompression_time: 0.283s
hdf5plugin 4.1.0: lz4 compression time 0.226s, lz4 decompression_time: 1.630s
hdf5plugin 5.0.0: lz4 compression time 0.221s, lz4 decompression_time: 1.610s
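For reference, the slowdown implied by these timings can be computed directly. This is only arithmetic on the numbers reported above; the `timings` dictionary and the `slowdown` helper are names made up for this illustration.

```python
# Timings in seconds, copied from the results reported above.
timings = {
    "4.0.1": {"compress": 0.219, "decompress": 0.283},
    "4.1.0": {"compress": 0.226, "decompress": 1.630},
    "5.0.0": {"compress": 0.221, "decompress": 1.610},
}

# hdf5plugin 4.0.1 is used as the baseline for the comparison.
baseline = timings["4.0.1"]


def slowdown(version, phase):
    """Return how many times slower `version` is than 4.0.1 for `phase`."""
    return timings[version][phase] / baseline[phase]


for version in ("4.1.0", "5.0.0"):
    print(
        f"hdf5plugin {version}: "
        f"compression x{slowdown(version, 'compress'):.2f}, "
        f"decompression x{slowdown(version, 'decompress'):.2f}"
    )
```

Compression stays within a few percent of the baseline, while decompression takes between 5 and 6 times as long.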

I have seen similar results on Python 3.8 and 3.11 on Debian 12 with different h5py versions.

I would have expected a substantial speedup from 4.1.x to 5.0, since 5.0 updates to lz4 1.10 with its new multithreaded decompression. Instead, the decompression speed of 4.1.x and 5.0 appears to be the same, and multithreaded lz4 decompression does not seem to be used.
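To rule out one-off noise in a single measurement, the read can be repeated and the best time kept. The following is a minimal sketch of that idea, not the script that produced the numbers above: `best_of` and `time_lz4_read` are hypothetical helper names, and the smaller default array shape is an assumption chosen to keep memory use down.

```python
import time


def best_of(func, repeats=3):
    """Call func() `repeats` times and return the shortest duration in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return min(durations)


def time_lz4_read(shape=(256, 1024, 1024), repeats=3):
    """Write an LZ4-compressed dataset to memory and time reading it back.

    Hypothetical wrapper around the script above; the default shape is an
    assumption, not the size used for the reported results.
    """
    # Imported lazily so that best_of() stays usable without h5py installed.
    from io import BytesIO

    import h5py
    import hdf5plugin
    import numpy as np

    buffer = BytesIO()
    with h5py.File(buffer, "w") as f:
        f.create_dataset(
            "data", data=np.ones(shape, np.uint8), compression=hdf5plugin.LZ4()
        )

    with h5py.File(buffer, "r") as f:
        dataset = f["data"]
        # Each call reads the full dataset from the in-memory file.
        return best_of(lambda: dataset[:], repeats)
```

Calling `time_lz4_read()` under each installed hdf5plugin version should then give one best-of-three read time per version to compare.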

Dalbasar, Oct 29 '24 21:10