BUG: `scipy.stats.ttest_1samp` raises an error when using `keepdims=True`
Describe your issue.
Setting keepdims=True when using scipy.stats.ttest_1samp leads to an error.
Reproducing Code Example
import numpy as np
from scipy import stats
rng = np.random.default_rng()
rvs = stats.uniform.rvs(size=(10, 10, 10), random_state=rng)
stats.ttest_1samp(rvs, popmean=0.5, axis=2, keepdims=True)
Error message
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sdiebolt/Documents/Personal/scipy-error/.venv/lib/python3.12/site-packages/scipy/stats/_axis_nan_policy.py", line 565, in axis_nan_policy_wrapper
res = _add_reduced_axes(res, reduced_axes, keepdims)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sdiebolt/Documents/Personal/scipy-error/.venv/lib/python3.12/site-packages/scipy/stats/_axis_nan_policy.py", line 247, in _add_reduced_axes
return ([np.expand_dims(output, reduced_axes) for output in res]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sdiebolt/Documents/Personal/scipy-error/.venv/lib/python3.12/site-packages/numpy/lib/shape_base.py", line 597, in expand_dims
axis = normalize_axis_tuple(axis, out_ndim)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sdiebolt/Documents/Personal/scipy-error/.venv/lib/python3.12/site-packages/numpy/core/numeric.py", line 1380, in normalize_axis_tuple
axis = tuple([normalize_axis_index(ax, ndim, argname) for ax in axis])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.exceptions.AxisError: axis 2 is out of bounds for array of dimension 1
SciPy/NumPy/Python version and system information
1.13.0 1.26.4 sys.version_info(major=3, minor=12, micro=2, releaselevel='final', serial=0)
/home/sdiebolt/Documents/Personal/scipy-error/.venv/lib/python3.12/site-packages/scipy/__config__.py:154: UserWarning: Install `pyyaml` for better output
warnings.warn("Install `pyyaml` for better output", stacklevel=1)
{
"Compilers": {
"c": {
"name": "gcc",
"linker": "ld.bfd",
"version": "10.2.1",
"commands": "cc"
},
"cython": {
"name": "cython",
"linker": "cython",
"version": "3.0.10",
"commands": "cython"
},
"c++": {
"name": "gcc",
"linker": "ld.bfd",
"version": "10.2.1",
"commands": "c++"
},
"fortran": {
"name": "gcc",
"linker": "ld.bfd",
"version": "10.2.1",
"commands": "gfortran"
},
"pythran": {
"version": "0.15.0",
"include directory": "../../tmp/pip-build-env-giifv1os/overlay/lib/python3.12/site-packages/pythran"
}
},
"Machine Information": {
"host": {
"cpu": "x86_64",
"family": "x86_64",
"endian": "little",
"system": "linux"
},
"build": {
"cpu": "x86_64",
"family": "x86_64",
"endian": "little",
"system": "linux"
},
"cross-compiled": false
},
"Build Dependencies": {
"blas": {
"name": "openblas",
"found": true,
"version": "0.3.26.dev",
"detection method": "pkgconfig",
"include directory": "/usr/local/include",
"lib directory": "/usr/local/lib",
"openblas configuration": "USE_64BITINT=0 DYNAMIC_ARCH=1 DYNAMIC_OLDER= NO_CBLAS= NO_LAPACK= NO_LAPACKE= NO_AFFINITY=1 USE_OPENMP= ZEN MAX_THREADS=64",
"pc file directory": "/usr/local/lib/pkgconfig"
},
"lapack": {
"name": "openblas",
"found": true,
"version": "0.3.26.dev",
"detection method": "pkgconfig",
"include directory": "/usr/local/include",
"lib directory": "/usr/local/lib",
"openblas configuration": "USE_64BITINT=0 DYNAMIC_ARCH=1 DYNAMIC_OLDER= NO_CBLAS= NO_LAPACK= NO_LAPACKE= NO_AFFINITY=1 USE_OPENMP= ZEN MAX_THREADS=64",
"pc file directory": "/usr/local/lib/pkgconfig"
},
"pybind11": {
"name": "pybind11",
"version": "2.12.0",
"detection method": "config-tool",
"include directory": "unknown"
}
},
"Python Information": {
"path": "/opt/python/cp312-cp312/bin/python",
"version": "3.12"
}
}
I think I managed to find what the problem is, I'll try a PR if it's alright!
I'll try a PR if it's alright!
sure, go for it!
LMK if you're still working on this, otherwise I'll likely fix it tonight. There are some features that have been added since _add_reduced_axes was written that store non-array private attributes in the result object, and these are commplaining when we try to add dimensions to them. I don't think it's worth a very deep dive to fix, since the whole decorator will likely be rewritten with the array API work that is going on. The fastest fix would probably be to add a heuristic (which will become a new convention) to determine which attributes do not need dimensions added. For example, in this case, _alternative is a Python int rather than a NumPy scalar or array - maybe that can be the signal that dimensions should not be added to it. That way, we can restrict the changes to the _add_reduced_axes function itself. I don't think there are very many cases we need to consider, so as long as we document it and expand the tests, it should be fine. Currently, we're only testing keepdims for two functions - which is unusual because the rest of the features of the _axis_nan_policy decorator are tested for every function to which they're applied. I think this was done to save CI time. But we might as well add all the functions in there, at least as a slow test.
Indeed, I thought about adding such a heuristic, e.g., expanding dims only from arrays and not from any other types. However I was unsure of the impact of such a change and was trying to understand the implications, since _add_reduced_axes seemed to be used by many other functions. I won't be able to push a PR until tomorrow anyway, so if you're more comfortable with the impacts and able to do it tonight, go for it! 😉
No, go ahead and try that approach when you can, and I can review. I don't think you can only expand dims from arrays because you will want to expand NumPy scalar types, too. But see what happens if you just look out for Python ints e.g.
return ([(np.expand_dims(output, reduced_axes) if not isinstance(output, int) else output) for output in res]
if keepdims else res)
And add all the other functions in axis_nan_policy_cases to test_keepdims: https://github.com/scipy/scipy/blob/7dcd8c59933524986923cde8e9126f5fc2e6b30b/scipy/stats/tests/test_axis_nan_policy.py#L487
That should smoke out any other cases I'm not thinking of. But IIRC there should not be too many cases - it really might be just the few functions that return a result with this _alternative parameter.
gh-20734 fixes the bug for ttest_1samp, but I'll leave this open to remind us to extend the test to other functions and remove the warning filters.