machine-learning-notes

Incorrect benchmark of numpy and arrow backends

Open tdpetrou opened this issue 1 year ago • 0 comments

There are a couple of issues in this notebook that you can fix to provide a fairer comparison between numpy and arrow. Most importantly, arrow stores data column by column, so you need to make the numpy array a Fortran-ordered (column-major) array with:

numbers = np.asfortranarray(numbers)
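
To check that the conversion took effect, something like the following can be used. This is only a minimal sketch: the small 3x4 array is a hypothetical stand-in for the notebook's numbers array, and numbers_f is a name introduced here for illustration. Note that np.asfortranarray returns a new array rather than changing the input, so the result has to be kept.

```python
import numpy as np

# Hypothetical stand-in for the notebook's `numbers` array (assumption).
numbers = np.arange(12, dtype=np.float64).reshape(3, 4)

# asfortranarray returns a column-major copy; the input is left unchanged.
numbers_f = np.asfortranarray(numbers)

# Memory-layout flags: the original is row-major, the copy is column-major.
print("original :", numbers.flags["C_CONTIGUOUS"], numbers.flags["F_CONTIGUOUS"])
print("fortran  :", numbers_f.flags["C_CONTIGUOUS"], numbers_f.flags["F_CONTIGUOUS"])

# Strides show which axis is contiguous in memory (bytes per step).
print("strides  :", numbers.strides, numbers_f.strides)

# The values themselves are identical; only the layout differs.
print("same data:", np.array_equal(numbers, numbers_f))
```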

Next, numbers.sum() with no axis argument sums over both axes, reducing every value in the array to a single result. To compare the backends properly, you need to sum along each axis separately with numbers.sum(axis=0) and numbers.sum(axis=1). You will see that arrow is 1000x slower when summing across the horizontal axis.
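
A rough sketch of the numpy side of such a per-axis comparison is below. The array shape, the seed, and the repeat count are assumptions picked for illustration, not values from the notebook, and the arrow side is left out; timings will vary by machine, so no expected numbers are given.

```python
import timeit

import numpy as np

# Hypothetical data; the notebook's real array may have a different shape.
rng = np.random.default_rng(0)
numbers = rng.random((1000, 100))
numbers_f = np.asfortranarray(numbers)  # column-major copy

# No axis: every value is reduced to a single scalar.
total = numbers.sum()
# axis=0: one sum per column (vertical reduction).
col_sums = numbers.sum(axis=0)
# axis=1: one sum per row (horizontal reduction).
row_sums = numbers.sum(axis=1)
print(np.ndim(total), col_sums.shape, row_sums.shape)

# Time each axis separately for both memory layouts.
for name, arr in [("C order", numbers), ("Fortran order", numbers_f)]:
    for axis in (0, 1):
        seconds = timeit.timeit(lambda: arr.sum(axis=axis), number=200)
        print(f"{name:14s} axis={axis}: {seconds:.4f} s for 200 runs")
```

Timing each axis on its own is what exposes layout effects that a single numbers.sum() call hides.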

tdpetrou Apr 06 '23 13:04