numexpr
pandas testsuite with numpy 2.0.0rc1 fails on numexpr
I'm currently testing numpy 2 in the openSUSE Python ecosystem. I noticed that the pandas test suite fails when numpy 2.0.0rc1 is installed instead of 1.26.4:
[ 681s] =================================== FAILURES ===================================
[ 681s] _ TestTypeCasting.test_binop_typecasting[numexpr-python-left_right0-float64-/] _
[ 681s] [gw1] linux -- Python 3.11.8 /usr/bin/python3.11
[ 681s]
[ 681s] self = <pandas.tests.computation.test_eval.TestTypeCasting object at 0x7ff16c7666d0>
[ 681s] engine = 'numexpr', parser = 'python', op = '/', dt = <class 'numpy.float64'>
[ 681s] left_right = ('df', '3')
[ 681s]
[ 681s] @pytest.mark.parametrize("op", ["+", "-", "*", "**", "/"])
[ 681s] # maybe someday... numexpr has too many upcasting rules now
[ 681s] # chain(*(np.core.sctypes[x] for x in ['uint', 'int', 'float']))
[ 681s] @pytest.mark.parametrize("dt", [np.float32, np.float64])
[ 681s] @pytest.mark.parametrize("left_right", [("df", "3"), ("3", "df")])
[ 681s] def test_binop_typecasting(self, engine, parser, op, dt, left_right):
[ 681s] df = DataFrame(np.random.default_rng(2).standard_normal((5, 3)), dtype=dt)
[ 681s] left, right = left_right
[ 681s] s = f"{left} {op} {right}"
[ 681s] > res = pd.eval(s, engine=engine, parser=parser)
[ 681s]
[ 681s] pandas/tests/computation/test_eval.py:756:
[ 681s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[ 681s] pandas/core/computation/eval.py:357: in eval
[ 681s] ret = eng_inst.evaluate()
[ 681s] pandas/core/computation/engines.py:81: in evaluate
[ 681s] res = self._evaluate()
[ 681s] pandas/core/computation/engines.py:121: in _evaluate
[ 681s] return ne.evaluate(s, local_dict=scope)
[ 681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:977: in evaluate
[ 681s] raise e
[ 681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:874: in validate
[ 681s] _names_cache[expr_key] = getExprNames(ex, context, sanitize=sanitize)
[ 681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:723: in getExprNames
[ 681s] ex = stringToExpression(text, {}, context, sanitize)
[ 681s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[ 681s]
[ 681s] s = '(df) / (np.float64(3.0))', types = {}
[ 681s] context = {'optimization': 'aggressive', 'truediv': False}, sanitize = True
[ 681s]
[ 681s] def stringToExpression(s, types, context, sanitize: bool=True):
[ 681s] """Given a string, convert it to a tree of ExpressionNode's.
[ 681s] """
[ 681s] # sanitize the string for obvious attack vectors that NumExpr cannot
[ 681s] # parse into its homebrew AST. This is to protect the call to `eval` below.
[ 681s] # We forbid `;`, `:`. `[` and `__`, and attribute access via '.'.
[ 681s] # We cannot ban `.real` or `.imag` however...
[ 681s] # We also cannot ban `.\d*j`, where `\d*` is some digits (or none), e.g. 1.5j, 1.j
[ 681s] if sanitize:
[ 681s] no_whitespace = re.sub(r'\s+', '', s)
[ 681s] skip_quotes = re.sub(r'(\'[^\']*\')', '', no_whitespace)
[ 681s] if _blacklist_re.search(skip_quotes) is not None:
[ 681s] > raise ValueError(f'Expression {s} has forbidden control characters.')
[ 681s] E ValueError: Expression (df) / (np.float64(3.0)) has forbidden control characters.
[ 681s]
[ 681s] /usr/lib64/python3.11/site-packages/numexpr/necompiler.py:283: ValueError
...
[ 684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-python-left_right0-float64-/]
[ 684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-python-left_right1-float64-/]
[ 684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-pandas-left_right0-float64-/]
[ 684s] FAILED pandas/tests/computation/test_eval.py::TestTypeCasting::test_binop_typecasting[numexpr-pandas-left_right1-float64-/]
[ 684s] FAILED pandas/tests/computation/test_eval.py::TestOperations::test_simple_arith_ops[numexpr-python]
[ 684s] FAILED pandas/tests/computation/test_eval.py::TestOperations::test_simple_arith_ops[numexpr-pandas]
This is a consequence of the sanitizer that numexpr added a few months ago. Calling arbitrary functions inside numexpr expressions is generally considered a bad idea, so we encourage rewriting that test to pass the scalar in as a variable instead:
In [11]: ne.evaluate('(df) / b', {'b': np.float64(3.0)})
Out[11]: array([0.33333333, 0.66666667, 1. ])
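To make it clearer why the first form is rejected and the second is not, here is a rough sketch of the check visible in the traceback above, using only the standard library. The regular expression is a simplified, hypothetical stand-in for numexpr's private `_blacklist_re` (its real pattern is not shown in this thread), and `looks_forbidden` is a name made up for illustration. It only mirrors the three steps shown in `stringToExpression`: strip whitespace, drop single-quoted strings, then search for forbidden tokens.

```python
import re

# Hypothetical, simplified stand-in for numexpr's private `_blacklist_re`.
# It flags `;`, `:`, `[`, `__`, and attribute access via `.`, while letting
# `.real`, `.imag`, and literals such as `1.5j` or `1.j` through.
_SKETCH_BLACKLIST_RE = re.compile(
    r"(;|:|\[|__)"
    r"|(\.(?!(?:real|imag|j)\b)[A-Za-z_])"
)


def looks_forbidden(expr: str) -> bool:
    """Return True if the sketch sanitizer would reject `expr`."""
    # Same preprocessing steps as shown in the traceback.
    no_whitespace = re.sub(r"\s+", "", expr)
    skip_quotes = re.sub(r"(\'[^\']*\')", "", no_whitespace)
    return _SKETCH_BLACKLIST_RE.search(skip_quotes) is not None


# The expression from the failing test: `.float64` is attribute access.
print(looks_forbidden("(df) / (np.float64(3.0))"))  # True
# The suggested rewrite: the scalar is passed in as a plain variable.
print(looks_forbidden("(df) / b"))  # False
# `.real` and complex literals are still allowed.
print(looks_forbidden("a.real + 1.5j"))  # False
```

Under this sketch, the `3.0` literal is harmless on its own. It is the `np.` prefix in the expression string that trips the check, which is why binding the value to a name such as `b` avoids the error.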
I am not sure why the pandas bug has not been picked up yet. NumPy has been released for a few weeks already, and the regular pandas CI should have encountered the failure by now.
This was fixed in pandas main and is being backported in https://github.com/pandas-dev/pandas/pull/59513