Histogramnd fails on large arrays

Open pierrepaleo opened this issue 3 years ago • 0 comments

Histogramnd crashes on arrays with more than 2**31 - 1 elements:

import numpy as np
from silx.math import Histogramnd

def test_histogram(n_samples, n_bins):
    data = np.random.randint(0, high=1000, size=(n_samples,), dtype=np.int32)
    dmin, dmax = (0, 1000)
    histogrammer = Histogramnd(
        data, n_bins=n_bins, histo_range=(dmin, dmax), last_bin_closed=True
    )
    return histogrammer.histo, histogrammer.edges[0]

test_histogram(2**31 - 1, 100) # succeeds
test_histogram(2**31, 100) # fails

This is probably because the element count (i_n_elem) in the histogramnd functions is declared as int, which is typically 32 bits wide and cannot hold values above 2**31 - 1:

int TEMPLATE(histogramnd, HISTO_SAMPLE_T, HISTO_WEIGHT_T, HISTO_CUMUL_T)
                        (HISTO_SAMPLE_T *i_sample,
                         HISTO_WEIGHT_T *i_weights,
                         int i_n_dim,
                         int i_n_elem,  // <== should be size_t

Using int for i_n_dim should be fine.
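
To illustrate the suspected cause, here is a minimal sketch (not silx code) of what happens when an element count is squeezed into a C int. The helper names fits_in_c_int and as_c_int are hypothetical and only serve the illustration; it relies on ctypes performing no overflow checking on integer types.

```python
import ctypes

# Width of a C int on this platform (typically 32 bits).
INT_BITS = 8 * ctypes.sizeof(ctypes.c_int)
INT_MAX = 2 ** (INT_BITS - 1) - 1
INT_MIN = -(2 ** (INT_BITS - 1))


def fits_in_c_int(n_elem):
    """Hypothetical helper: True if n_elem is representable as a C int."""
    return INT_MIN <= n_elem <= INT_MAX


def as_c_int(n_elem):
    """Hypothetical helper: value obtained when n_elem is truncated to a C int.

    ctypes integer types do no overflow checking, so the value wraps around.
    """
    return ctypes.c_int(n_elem).value


if __name__ == "__main__":
    for n in (INT_MAX, INT_MAX + 1):
        print(n, fits_in_c_int(n), as_c_int(n))
```

Where int is 32 bits, INT_MAX is 2**31 - 1, so the first sample count from the report fits and the second does not. A wrapped, negative element count would be consistent with the observed failure, although this has not been confirmed against the silx sources.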

pierrepaleo • Feb 08 '22 15:02