machine-learning-notes
Incorrect benchmark of numpy and arrow backends
There are a couple of issues in this notebook that you can fix to provide a better comparison between numpy and arrow. Most importantly, you need to make the numpy array a Fortran-ordered (column-major) array with:
np.asfortranarray(numbers)
Next, numbers.sum() sums over both axes: it adds up every value in the array and produces a single result. You need to compare each axis separately, with numbers.sum(axis=0) and numbers.sum(axis=1). You will see that arrow is 1000x slower when summing across the horizontal axis.
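For reference, here is a minimal sketch of the per-axis comparison on the numpy side only. The array shape, the repeat count, and the `time_sum` helper are my own assumptions for illustration, not taken from the notebook, and the arrow-backed side is left out; the same three calls would need to be timed against it.

```python
import timeit

import numpy as np

# Hypothetical data: the shape is an assumption, not the notebook's actual size.
rng = np.random.default_rng(0)
numbers = rng.random((2_000, 200))

# Same values, two memory layouts.
numbers_c = np.ascontiguousarray(numbers)  # row-major (C order)
numbers_f = np.asfortranarray(numbers)     # column-major (Fortran order)

# The three different reductions being discussed.
total = numbers.sum()             # both axes: a single scalar
per_column = numbers.sum(axis=0)  # down the rows: one value per column
per_row = numbers.sum(axis=1)     # across the columns: one value per row


def time_sum(arr, axis, repeat=20):
    """Average seconds per call of arr.sum(axis=axis). Illustrative helper."""
    return timeit.timeit(lambda: arr.sum(axis=axis), number=repeat) / repeat


for name, arr in (("C order", numbers_c), ("Fortran order", numbers_f)):
    for axis in (None, 0, 1):
        print(f"{name:14s} axis={axis}: {time_sum(arr, axis):.6f} s")
```

The timings depend on the machine and the array size, so treat the printed numbers as relative only. The point is that axis=None, axis=0, and axis=1 are three separate measurements, and each should be reported for both backends.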