numexpr leaves arrays in neither C nor Fortran ordering

Open markvdw opened this issue 9 years ago • 5 comments

This is more of a question, rather than an issue, but I couldn't find a reference to a users list, so I hope it's ok here.

After rewriting code to use numexpr, I noticed that it significantly slowed down some numba code elsewhere, unless I forced the output to be in C contiguous form. Upon checking the flags, I noticed that the output array wasn't F contiguous either. I can definitely see how this would influence numba code, but would you expect later numpy code to experience a similar slowdown? Given the adage that numpy performs best when given C contiguous arrays...
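For readers following along, here is a minimal sketch of checking the contiguity flags and forcing C order, as described above. It uses only NumPy; the array `x` is a hypothetical stand-in for the output of an earlier computation, not the data from this issue.

```python
import numpy as np

# Hypothetical stand-in for an array produced by an earlier step,
# deliberately stored in Fortran (column-major) order.
x = np.asfortranarray(np.arange(12, dtype=np.float64).reshape(3, 4))

# Inspect the two layout flags discussed in this issue.
print(x.flags["C_CONTIGUOUS"], x.flags["F_CONTIGUOUS"])

# np.ascontiguousarray returns a C-ordered array; it copies only when
# the input is not already C contiguous.
x_c = np.ascontiguousarray(x)
print(x_c.flags["C_CONTIGUOUS"], x_c.flags["F_CONTIGUOUS"])
```

The conversion costs one extra copy, so it is probably best done once, right before handing the array to layout-sensitive code.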

markvdw avatar Feb 04 '16 10:02 markvdw

Uh, could you be a bit more explicit about what your issue is? The best approach is to send a small piece of code (a few lines) that reproduces the problem. Thanks.

FrancescAlted avatar Feb 04 '16 12:02 FrancescAlted

With fresh eyes, I have found the issue this morning. For interest, here is how to reproduce the problem I was having:

import numpy as np
import numpy.random as rnd
import numexpr as ne
a = rnd.randn(1000, 32)
a = np.asfortranarray(a)  # I loaded a matrix from file, which (unknown to me) was Fortran contiguous
b = rnd.randn(103, 32)
d = ne.evaluate("(a - b) / ard")
print(d.flags)

Giving:

  C_CONTIGUOUS : False
  F_CONTIGUOUS : False

This caused poor performance in numba code that followed. The root cause, however, was the Fortran contiguous array a (which in my code was loaded from disk). My incorrect understanding was that numexpr was outputting a weirdly ordered array for no reason and without a message, possibly causing problems with further numpy/numba code.
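A short sketch of how a Fortran-ordered array can arrive from disk unnoticed, assuming the file is in NumPy's `.npy` format (the issue does not say which format was actually used). An in-memory buffer stands in for the file, and the variable names are illustrative only.

```python
import io

import numpy as np

# Save a Fortran-ordered array; the .npy header records the memory order.
original = np.asfortranarray(np.random.default_rng(0).standard_normal((1000, 32)))
buf = io.BytesIO()  # in-memory stand-in for a file on disk
np.save(buf, original)
buf.seek(0)

# Loading preserves the stored order, so the layout survives the round trip.
loaded = np.load(buf)
print(loaded.flags["C_CONTIGUOUS"], loaded.flags["F_CONTIGUOUS"])

# Normalising to C order right after loading avoids surprises downstream.
a = np.ascontiguousarray(loaded)
print(a.flags["C_CONTIGUOUS"], a.flags["F_CONTIGUOUS"])
```

Checking `flags` once at load time is cheap and would have exposed the root cause here before any numexpr or numba code ran.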

markvdw avatar Feb 04 '16 13:02 markvdw

Hmm, I can't reproduce your result. What's 'ard'? Anyway, a simplified version follows:

In [1]: import numpy as np  

In [2]: import numexpr as ne

In [3]: a = np.random.randn(1000, 32)

In [4]: a = np.asfortranarray(a)

In [5]: b = np.random.randn(1000, 32)

In [6]: a.flags
Out[6]: 
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [7]: b.flags
Out[7]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [8]: ne.evaluate("(a - b)").flags
Out[8]: 
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

Which seems reasonable to me.

FrancescAlted avatar Feb 04 '16 13:02 FrancescAlted

Apologies again,

import numpy as np
import numpy.random as rnd
import numexpr as ne
a = rnd.randn(1000, 32)[:, None, :]
a = np.asfortranarray(a)  # I loaded a matrix from file, which (unknown to me) was Fortran contiguous
b = rnd.randn(103, 32)
ard = rnd.randn(32)
d = ne.evaluate("(a - b) / ard")
print(d.flags)

Gives me:

In [1]: import numexpr as ne

In [2]: a = rnd.randn(1000, 32)[:, None, :]

In [3]: a = np.asfortranarray(a)  # I loaded a matrix from file, which (unknown to me) was Fortran contiguous

In [4]: b = rnd.randn(103, 32)

In [5]: ard = rnd.randn(32)

In [6]: d = ne.evaluate("(a - b) / ard")

In [7]: print(d.flags)
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
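To illustrate how a densely packed array can have both flags set to False, here is a NumPy-only sketch. It builds a view whose axes are stored in a permuted order; this is one plausible way to get such flags, not a claim about what numexpr does internally. The names `base` and `view` are hypothetical.

```python
import numpy as np

# C-ordered buffer whose axes are then permuted. The result has the same
# shape as the numexpr output above, (1000, 103, 32), but its memory
# order is neither row-major nor column-major.
base = np.zeros((1000, 32, 103))
view = base.transpose(0, 2, 1)

# The strides show which axis moves fastest in memory.
print(view.shape, view.strides)

# C order needs the last axis to have the smallest stride, Fortran order
# the first axis. Here the middle axis does, so both flags are False.
print(view.flags["C_CONTIGUOUS"], view.flags["F_CONTIGUOUS"])
```

Comparing `d.strides` against `d.shape` in the session above would show in the same way which axis order the output actually uses.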

markvdw avatar Feb 04 '16 14:02 markvdw

Hmm, yes, having both C_CONTIGUOUS and F_CONTIGUOUS as False is indeed bad. Not sure how to fix that though.
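Until this is addressed in numexpr itself, one possible workaround is to preallocate a C-ordered result and pass it through the `out` argument of `ne.evaluate`. The sketch below assumes that argument behaves like NumPy's `out` and writes into the given array; it falls back to plain NumPy when numexpr is not installed. The explicit `local_dict` is used only to keep the example independent of the calling frame.

```python
import numpy as np

try:
    import numexpr as ne  # optional; the sketch falls back to NumPy without it
except ImportError:
    ne = None

rng = np.random.default_rng(0)
a = np.asfortranarray(rng.standard_normal((1000, 32))[:, None, :])
b = rng.standard_normal((103, 32))
ard = rng.standard_normal(32)

# Preallocate the result in C order using the broadcast shape of the operands.
out = np.empty(np.broadcast(a, b, ard).shape)

if ne is not None:
    # Assumption: evaluate() writes into `out` and keeps its memory layout.
    d = ne.evaluate("(a - b) / ard",
                    local_dict={"a": a, "b": b, "ard": ard},
                    out=out)
else:
    d = np.divide(a - b, ard, out=out)

print(d.flags["C_CONTIGUOUS"], d.flags["F_CONTIGUOUS"])
```

Alternatively, calling `np.ascontiguousarray(d)` on the result gives a C-ordered array at the cost of one extra copy.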

FrancescAlted avatar Feb 05 '16 12:02 FrancescAlted
