silx
Histogramnd fails on large arrays
Histogramnd crashes on arrays with more than 2**31 - 1 elements:
import numpy as np
from silx.math import Histogramnd


def test_histogram(n_samples, n_bins):
    data = np.random.randint(0, high=1000, size=(n_samples,), dtype=np.int32)
    dmin, dmax = (0, 1000)
    histogrammer = Histogramnd(
        data, n_bins=n_bins, histo_range=(dmin, dmax), last_bin_closed=True
    )
    return histogrammer.histo, histogrammer.edges[0]


test_histogram(2**31 - 1, 100)  # succeeds
test_histogram(2**31, 100)  # fails
This is probably due to the data type used for the element count (i_n_elem) in the histogramnd functions, which is declared as int:
int TEMPLATE(histogramnd, HISTO_SAMPLE_T, HISTO_WEIGHT_T, HISTO_CUMUL_T)
    (HISTO_SAMPLE_T *i_sample,
     HISTO_WEIGHT_T *i_weights,
     int i_n_dim,
     int i_n_elem,  // <== should be size_t
Using int for i_n_dim should be fine.
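
To illustrate why 2**31 is the threshold, here is a minimal sketch using only the standard ctypes module. It assumes a C int is 32 bits wide, which holds on common platforms. The helper names as_c_int32 and as_c_size_t are hypothetical and not part of silx, and the sketch does not reproduce how silx itself fails; it only shows that the element count no longer fits in a signed 32-bit integer, while size_t can hold it.

```python
import ctypes


def as_c_int32(n):
    """Return n as stored in a signed 32-bit C integer (hypothetical helper).

    ctypes does no overflow checking, so out-of-range values wrap around.
    """
    return ctypes.c_int32(n).value


def as_c_size_t(n):
    """Return n as stored in a C size_t (hypothetical helper)."""
    return ctypes.c_size_t(n).value


print(as_c_int32(2**31 - 1))  # largest count a 32-bit int can hold
print(as_c_int32(2**31))      # one more element: wraps to a negative value
print(as_c_size_t(2**31))     # size_t keeps the value intact
```

A negative or truncated element count would be consistent with the failure starting at exactly 2**31 elements, though the actual failure path in silx may differ (for example, the conversion could be rejected before the C function is reached).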