LayoutBuilder in Numba is slower than ArrayBuilder in Numba
Version of Awkward Array
2.3.1
Description and code to reproduce
@jpivarski - as discussed, I'm looking into the issue. Indeed, there is nearly a 7x difference between an ArrayBuilder in Numba and a LayoutBuilder in Numba (each test runs twice, so that the second run excludes the JIT "warm up"):
import time

import awkward as ak
import numba
import numpy as np

import awkward._connect.numba.arrayview
import awkward.numba.layoutbuilder as lb

ak.numba.register_and_check()

MULTIPLIER = int(10e6)
print("MULTIPLIER", MULTIPLIER)


def test_Numpy_LayoutBuilder():
    @numba.njit
    def f3(x):
        for i in range(MULTIPLIER):
            x.append(1.1)
            x.append(2.2)
            x.append(3.3)
            x.append(4.4)
            x.append(5.5)
        return x

    l = lb.Numpy(np.float64)
    b = f3(l)


def test_Numpy_ArrayBuilder():
    @numba.njit
    def f4(x):
        for i in range(MULTIPLIER):
            x.real(1.1)
            x.real(2.2)
            x.real(3.3)
            x.real(4.4)
            x.real(5.5)
        return x

    a = ak.highlevel.ArrayBuilder()
    b = f4(a)


for function in test_Numpy_LayoutBuilder, test_Numpy_ArrayBuilder:
    t1 = time.perf_counter(), time.process_time()
    function()
    t2 = time.perf_counter(), time.process_time()
    print(f"{function.__name__}()")
    print(f" Real time: {t2[0] - t1[0]:.2f} seconds")
    print(f" CPU time: {t2[1] - t1[1]:.2f} seconds")
    print()

    t1 = time.perf_counter(), time.process_time()
    function()
    t2 = time.perf_counter(), time.process_time()
    print(f"{function.__name__}()")
    print(f" Real time: {t2[0] - t1[0]:.2f} seconds")
    print(f" CPU time: {t2[1] - t1[1]:.2f} seconds")
    print()
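As a side note, the duplicated timing block could be factored into a small helper. This is only a sketch: `time_twice` and `report` are hypothetical names, not part of Awkward Array or Numba, and the helper assumes the function under test takes no arguments, as the two test functions here do.

```python
import time


def time_twice(function):
    """Call `function` twice; return [(real, cpu), (real, cpu)] in seconds.

    The first entry includes any one-time cost (such as JIT compilation),
    the second entry is the "warmed up" run.
    """
    results = []
    for _ in range(2):
        t1 = time.perf_counter(), time.process_time()
        function()
        t2 = time.perf_counter(), time.process_time()
        results.append((t2[0] - t1[0], t2[1] - t1[1]))
    return results


def report(function):
    """Print both timings in the same layout as the script in this issue."""
    for real, cpu in time_twice(function):
        print(f"{function.__name__}()")
        print(f" Real time: {real:.2f} seconds")
        print(f" CPU time: {cpu:.2f} seconds")
        print()
```

With this in place, the loop would reduce to calling `report(function)` once per test function.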
Timings to build an array of five elements:
test_Numpy_LayoutBuilder()
Real time: 1.62 seconds
CPU time: 1.49 seconds
test_Numpy_LayoutBuilder()
Real time: 0.11 seconds
CPU time: 0.11 seconds
test_Numpy_ArrayBuilder()
Real time: 0.04 seconds
CPU time: 0.04 seconds
test_Numpy_ArrayBuilder()
Real time: 0.04 seconds
CPU time: 0.04 seconds
Timings to build an array of 5 x 10e6 (50 million) elements:
test_Numpy_LayoutBuilder()
Real time: 4.87 seconds
CPU time: 4.85 seconds
test_Numpy_LayoutBuilder()
Real time: 3.79 seconds
CPU time: 3.77 seconds
test_Numpy_ArrayBuilder()
Real time: 0.56 seconds
CPU time: 0.56 seconds
test_Numpy_ArrayBuilder()
Real time: 0.57 seconds
CPU time: 0.57 seconds
It looks like using a numba.typed.List could improve LayoutBuilder performance:
MULTIPLIER 10000000
test_Numpy_LayoutBuilder()
Real time: 5.15 seconds
CPU time: 5.14 seconds
test_Numpy_LayoutBuilder()
Real time: 4.02 seconds
CPU time: 4.01 seconds
test_Numpy_ArrayBuilder()
Real time: 0.58 seconds
CPU time: 0.58 seconds
test_Numpy_ArrayBuilder()
Real time: 0.58 seconds
CPU time: 0.58 seconds
test_Numpy_TypedList()
Real time: 1.03 seconds
CPU time: 1.03 seconds
test_Numpy_TypedList()
Real time: 0.68 seconds
CPU time: 0.67 seconds
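To put these numbers side by side, the short calculation below divides each second-run ("warmed up") real time by the ArrayBuilder time. The timings are copied from the output above; the dictionary layout and variable names are just for illustration.

```python
# Second-run real times in seconds, copied from the output above.
second_run = {
    "LayoutBuilder": 4.02,
    "ArrayBuilder": 0.58,
    "TypedList": 0.68,
}

# Express every timing as a multiple of the ArrayBuilder timing.
baseline = second_run["ArrayBuilder"]
ratios = {name: seconds / baseline for name, seconds in second_run.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x the ArrayBuilder time")
```

On these figures the LayoutBuilder takes roughly 6.9 times as long as the ArrayBuilder, while the typed list takes roughly 1.2 times as long.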
A benchmark from HDembinski. See the notebook
In [1]: import numba as nb
...: import numpy as np
...: import awkward as ak
...: print(f"{nb.__version__=}")
...: print(f"{ak.__version__=}")
...:
nb.__version__='0.58.0rc1'
ak.__version__='2.3.3'