numexpr
calculation of sum is slow and uses only one core
From [email protected] on February 19, 2012 03:06:02
- a = numpy.random.random((10000,10000))
- numexpr.evaluate("sin(a) + exp(a) + log(a + 3)").sum() - fast, uses all cores
- numexpr.evaluate("sum(sin(a) + exp(a) + log(a + 3))") - slow, uses one core
I often use sum for expressions like sum(exp(a[:,None]*b[None,:])), where two vectors are passed to numexpr and a single number is the output. It would be great to avoid creating the intermediate array a[:,None] * b[None,:] entirely.
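Until numexpr supports this directly, the intermediate can be kept small by reducing chunk by chunk. The sketch below is pure NumPy (not numexpr); the function name and the chunk size of 1024 are illustrative choices, and the same chunking idea would apply to a numexpr kernel:

```python
import numpy as np

def chunked_exp_outer_sum(a, b, chunk=1024):
    """Compute sum(exp(a[:, None] * b[None, :])) without materializing
    the full len(a) x len(b) outer-product array: only one
    chunk x len(b) slab exists at any time."""
    total = 0.0
    for start in range(0, len(a), chunk):
        slab = a[start:start + chunk, None] * b[None, :]  # small slab only
        total += np.exp(slab).sum()
    return total
```

Peak memory drops from len(a) * len(b) elements to chunk * len(b), at the cost of a Python-level loop over the chunks.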
numexpr: 2.0.1, numpy:1.6.1, OS: ubuntu 11.04
Original issue: http://code.google.com/p/numexpr/issues/detail?id=73
From [email protected] on March 03, 2012 23:16:39
I can reproduce this issue. It looks like sum (even on a single array) is not using multiple threads and anything inside the sum won't be accelerated.
What would be the best way to code this? I'll be happy to help.
Thanks.
From [email protected] on July 31, 2012 08:59:38
+1 In a use case I am encountering, the numexpr.evaluate("sum(a)") version takes over 60 s to complete and uses only one core, BUT keeps memory usage quite low. OTOH, the numexpr.evaluate("a").sum() version takes just a few seconds to complete and uses many cores, BUT uses as much as 15 GB of memory, albeit only momentarily.
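A workaround that gets the best of both (low memory and multicore elementwise work) is to evaluate the expression on bounded chunks and accumulate with ndarray.sum(). This is a sketch, not part of numexpr: the function name, chunk size, and pure-NumPy eval_fn default are illustrative; with numexpr installed you could pass e.g. lambda x: numexpr.evaluate("sin(x) + exp(x) + log(x + 3)") as eval_fn instead:

```python
import numpy as np

def low_memory_sum(a, chunk=1_000_000, eval_fn=None):
    """Evaluate an elementwise expression chunk by chunk (bounded
    peak memory) and accumulate each chunk with ndarray.sum()."""
    if eval_fn is None:
        # pure-NumPy stand-in for the elementwise kernel
        eval_fn = lambda x: np.sin(x) + np.exp(x) + np.log(x + 3)
    flat = np.ravel(a)
    total = 0.0
    for start in range(0, flat.size, chunk):
        total += eval_fn(flat[start:start + chunk]).sum()
    return total
```

Peak memory is bounded by the chunk size rather than the full array, and the elementwise kernel can still use all cores within each chunk.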
This is a very substantial defect which appears to affect multiple users.
From [email protected] on July 31, 2012 11:05:10
I don't think the default ndarray.sum() method is capable of using more than one core. The dirty workaround I use for myself now is a parallel sum using OpenMP and weave.inline. Here's the function:
    def openmpSum(in_array):
        """
        Performs a fast sum of an array using OpenMP.
        """
        import numpy
        from scipy import weave
        a = numpy.asarray(in_array)
        b = numpy.array([1.])
        N = int(numpy.prod(a.shape))
        code = r"""
        int i = 0;
        double sum = 0;
        omp_set_num_threads(4);
        #pragma omp parallel for \
            default(shared) private(i) \
            reduction(+:sum)
        for (i = 0; i < N; i++)
            sum += a[i];
        b[0] = sum;
        """
        weave.inline(code, ['a', 'N', 'b'],
                     extra_compile_args=['-march=native -O3 -fopenmp'],
                     support_code=r"""
                     #include <stdio.h>
                     #include <omp.h>
                     #include <math.h>""",
                     libraries=['gomp'])
        return b[0]
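For readers finding this thread today: weave is Python-2-only and has been removed from SciPy, so the snippet above no longer runs on modern stacks. A dependency-free alternative (a sketch; the thread count of 4 is arbitrary) relies on the fact that NumPy's reductions generally release the GIL, so summing independent chunks in a thread pool can use several cores:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def threaded_sum(a, n_threads=4):
    """Parallel sum: NumPy releases the GIL inside ndarray.sum(),
    so reducing independent chunks in threads can run concurrently."""
    flat = np.ravel(a)
    chunks = np.array_split(flat, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as ex:
        partials = list(ex.map(np.sum, chunks))
    return float(np.sum(partials))
```

Note that floating-point results may differ from a single np.sum() in the last bits, since the summation order changes.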
These reports date to mid-2012, but I am seeing this issue with the latest version of numexpr. Has any progress been made in solving it? Otherwise this library is extremely nice, but this defect makes it impossible for me to use on my project.
Right now numexpr is in pure maintenance mode. I typically still have time to review pull requests and merge them if appropriate, but not much more than this. So if this is something that you want to see in numexpr, you could still send a PR and I would review it.
Thanks for the clarification. I'll look into it and send a PR if I can.
Thanks for keeping the issue open, as it is still present.
Any news on this issue? evaluate('sum(X, 2)'), with X being a 4-D ndarray, is slower than NumPy and only uses one core for me.