BUG: masked std and median on unmasked array result in invalid masked array

Open maxnoe opened this issue 2 years ago • 3 comments

Describe the issue:

Since version 1.24, the code example below produces a masked array whose data array and mask array do not have the same shape.

Reproduce the code example:

import numpy as np
print(np.__version__)

rng = np.random.default_rng(0)

data = rng.normal(size=(2, 101))

data[:, 2] = np.nan

std = np.ma.std(data, axis=1)
median = np.ma.median(data, axis=1)

print("median:")
print(repr(median))
print("std:")
print(repr(std))

deviation = data - median[:, np.newaxis]

comparison = deviation < 0.5 * std[:, np.newaxis]

print(comparison.shape, comparison.mask.shape)
print(comparison)

Error message:

Output under 1.23:

1.23.5
median:
masked_array(data=[nan, nan],
             mask=False,
       fill_value=1e+20)
std:
masked_array(data=[--, --],
             mask=[ True,  True],
       fill_value=1e+20,
            dtype=float64)
(2, 101) (2, 101)
[[-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
  -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
  -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
  -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
  -- -- -- --]
 [-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
  -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
  -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
  -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
  -- -- -- --]]

Output under 1.25 (also 1.24):

1.25.2
median:
masked_array(data=[nan, nan],
             mask=False,
       fill_value=1e+20)
std:
masked_array(data=[--, --],
             mask=[ True,  True],
       fill_value=1e+20,
            dtype=float64)
(2, 101) (2, 1)
Traceback (most recent call last):
  File "/home/mnoethe/test_numpy_ma_std.py", line 23, in <module>
    print(comparison)
  File "/home/mnoethe/.local/conda/envs/numpy-1.25/lib/python3.10/site-packages/numpy/ma/core.py", line 3997, in __str__
    return str(self._insert_masked_print())
  File "/home/mnoethe/.local/conda/envs/numpy-1.25/lib/python3.10/site-packages/numpy/ma/core.py", line 3991, in _insert_masked_print
    _recursive_printoption(res, mask, masked_print_option)
  File "/home/mnoethe/.local/conda/envs/numpy-1.25/lib/python3.10/site-packages/numpy/ma/core.py", line 2437, in _recursive_printoption
    np.copyto(result, printopt, where=mask)
ValueError: could not broadcast where mask from shape (2,2) into shape (2,100)

Runtime information:

[{'numpy_version': '1.25.2', 'python': '3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) ' '[GCC 12.3.0]', 'uname': uname_result(system='Linux', node='e5b-dell-12', release='5.14.0-1051-oem', version='#58-Ubuntu SMP Fri Aug 26 05:50:00 UTC 2022', machine='x86_64')}, {'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'], 'found': ['SSSE3', 'SSE41', 'POPCNT', 'SSE42', 'AVX', 'F16C', 'FMA3', 'AVX2'], 'not_found': ['AVX512F', 'AVX512CD', 'AVX512_SKX', 'AVX512_CLX', 'AVX512_CNL', 'AVX512_ICL', 'AVX512_SPR']}}, {'architecture': 'Haswell', 'filepath': '/home/mnoethe/.local/conda/envs/numpy-1.25/lib/libopenblasp-r0.3.23.so', 'internal_api': 'openblas', 'num_threads': 20, 'prefix': 'libopenblas', 'threading_layer': 'pthreads', 'user_api': 'blas', 'version': '0.3.23'}]

Context for the issue:

Most confusingly, the example above works fine with numpy 1.25 if the shape of the data array is (2, 100) (just one element smaller in the last dimension).
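
Because printing only fails for some array sizes, a check that does not go through `__str__` may be a more reliable way to detect the inconsistency. The following is a minimal sketch; `mask_matches_data` is a hypothetical helper name, not part of NumPy. Going by the shapes printed above, `comparison` from the reproducer would fail this check on 1.24 and 1.25 ((2, 101) data versus a (2, 1) mask). Whether the (2, 100) case also fails it is not established in this report.

```python
import numpy as np


def mask_matches_data(arr):
    """Return True if the mask of `arr` has the same shape as its data.

    Hypothetical helper for illustration only; not a NumPy function.
    """
    # getmaskarray returns the stored mask, or an all-False array of
    # arr's shape when no mask is set.
    return np.ma.getmaskarray(arr).shape == np.shape(arr)


# A well-formed masked array passes the check.
healthy = np.ma.masked_invalid(np.array([[1.0, np.nan, 3.0]]))
print(mask_matches_data(healthy))
```

Calling `mask_matches_data(comparison)` right after the comparison in the reproducer avoids relying on `print`, which can succeed or fail depending on the array size.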

maxnoe • Aug 24 '23 16:08

I'm not entirely sure if this solves your problem, but I think this is resolved in the most recent 2.0.0.dev0+git20230830.b73a5ae version.

lvllvl • Sep 02 '23 00:09

Why do std and median have different masks?

Why is the median NaN unmasked but the std masked?
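
For comparison, here is a sketch of the same computation with the NaN entries masked explicitly via `np.ma.masked_invalid` before calling the statistics. This is only an illustration under the assumption that NaN values should be ignored rather than propagated, which changes the semantics of the original example. It does not explain the differing masks above, but with an explicitly masked input both `std` and `median` skip the masked column, and the comparison mask should have the full data shape.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(2, 101))
data[:, 2] = np.nan

# Assumption: NaN entries should be ignored, so mask them up front.
masked = np.ma.masked_invalid(data)

std = np.ma.std(masked, axis=1)
median = np.ma.median(masked, axis=1)

deviation = masked - median[:, np.newaxis]
comparison = deviation < 0.5 * std[:, np.newaxis]

print("median:")
print(repr(median))
print("std:")
print(repr(std))
# Data shape and mask shape should agree here.
print(comparison.shape, np.ma.getmaskarray(comparison).shape)
```

In this variant only column 2 of `comparison` is masked, instead of every element as in the 1.23 output above.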

maxnoe • Sep 02 '23 09:09

I noticed this bug and I’d like to take a closer look and see if I can provide a solution. It might be caused by an issue in sqrt. I’ll work on a potential fix and submit a PR if I make progress.

fengluoqiuwu • Oct 19 '24 14:10