
Could you please extend https://awkward-array.org/how-to-use-in-numba-arraybuilder.html

Open HDembinski opened this issue 3 years ago • 4 comments

Following the nice invite in the docs to raise an issue for unfilled topics that should get higher priority, I am doing that here.

I would like to write a numba function that returns an awkward array (specifically a ListOffsetArray) from other ListOffsetArrays. I need to run the equivalent of numpy.empty_like(...) and numpy.zeros_like(...) in numba-compiled Python with an awkward array. I suppose that filling the array is straightforward; I have trouble with the creation...

On a related note, is it possible to access the total size of a ListOffsetArray from within numba-compiled Python?

HDembinski avatar Nov 12 '20 12:11 HDembinski

Will do.

For zeros_like and ones_like, #493 will hopefully be done soon. (It's probably faster to write that function than the documentation on Numba, though the latter is also needed.)

As for the "total size of a ListOffsetArray": len can be used in a Numbafied function, but maybe you mean the sum of the lengths of the inner arrays. Outside of JIT, you can do np.sum(ak.num(array)), but inside, you'd have to write the loop explicitly:

import numba as nb

@nb.njit
def example(array):
    total = 0
    for subarray in array:
        # len of each inner list, accumulated one list at a time
        total += len(subarray)
    return total

I tried various combinations with list comprehensions, to make it shorter and more idiomatic, but Numba complained about unsupported features.

I was going to profile the plain for loop against a list comprehension, had that been possible, so I'll at least share the results for the plain for loop:

>>> content = np.random.normal(size=1000000)
>>> offsets = np.arange(0, 1000001, 50)
>>> offsets[1::2] += 47
>>> ak_content = ak.layout.NumpyArray(content)
>>> ak_offsets = ak.layout.Index64(offsets)
>>> ak_listoffsetarray = ak.layout.ListOffsetArray64(ak_offsets, ak_content)
>>> array = ak.Array(ak_listoffsetarray)

>>> array     # an array that alternates between length-97 and length-3 lists
<Array [[-0.356, -0.825, ... 0.132, -0.651]] type='20000 * var * float64'>
>>> array[0]
<Array [-0.356, -0.825, ... 1.16, -1.93] type='97 * float64'>
>>> array[1]
<Array [0.74, 0.789, 0.483] type='3 * float64'>
>>> array[2]
<Array [-2.85, 0.835, 1.13, ... -1.94, -0.379] type='97 * float64'>
In [2]: @nb.njit
   ...: def example1(array):
   ...:     total = 0
   ...:     for subarray in array:
   ...:         total += len(subarray)
   ...:     return total
   ...: 
In [12]: %%timeit
    ...: example1(array)
    ...: 
    ...: 
57.7 µs ± 982 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
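
For reference, the alternating list lengths follow directly from the offsets built above. Here is a minimal sketch in plain Python, with a list standing in for the NumPy offsets buffer so that neither Awkward nor Numba is needed; the helper name list_lengths is made up for this illustration:

```python
# Plain-Python stand-in for the offsets buffer built in the session above.
offsets = list(range(0, 1000001, 50))      # 0, 50, 100, ..., 1000000
for i in range(1, len(offsets), 2):        # same effect as offsets[1::2] += 47
    offsets[i] += 47


def list_lengths(offsets):
    """Length of each inner list: the difference of consecutive offsets."""
    return [stop - start for start, stop in zip(offsets, offsets[1:])]


lengths = list_lengths(offsets)
print(len(lengths), lengths[:4], sum(lengths))
```

The 20000 inner lists alternate between 97 and 3 elements and together cover all 1000000 values of content.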

jpivarski avatar Nov 12 '20 14:11 jpivarski

One would have to look at the assembly for this. If the loop really just adds the sizes of the arrays, then it is fine. If the loop over the subarrays constructs intermediate objects just to get at the size and then destroys them, this seems wasteful, especially since I can get the size in Python without a loop, via stops[-1] - starts[0] or something similar.
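
The loop-free shortcut I have in mind can be sketched in plain Python, outside of any JIT-compiled function. This assumes a ListOffsetArray-style layout in which list i spans content[offsets[i]:offsets[i + 1]]; both function names are hypothetical and not part of Awkward Array:

```python
def total_size_loop(offsets):
    """Sum the inner-list lengths one list at a time."""
    total = 0
    for i in range(len(offsets) - 1):
        total += offsets[i + 1] - offsets[i]
    return total


def total_size_direct(offsets):
    """Same result without a loop: the sum telescopes to last minus first."""
    return offsets[-1] - offsets[0]


# Offsets need not start at zero, e.g. for a slice of a larger array.
offsets = [3, 5, 5, 9, 10]
print(total_size_loop(offsets), total_size_direct(offsets))
```

Both functions agree because every intermediate offset cancels in the sum of differences, which is why only the first and last entries matter.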

HDembinski avatar Nov 12 '20 18:11 HDembinski

Having written the assembly generator for Awkward's Numba extension, I can say that it does not create intermediate lists, at least not any that copy array data (final contents or indexes like starts and stops). It does create stack-allocated structs of a fixed size (maybe a few dozen bytes) for each __getitem__; those structs contain pointers into the array buffers. Early on, I had to redesign to ensure that the size of this struct does not scale with the complexity of the type, let alone the size of the data it contains. Now the structs are completely fixed size:

https://github.com/scikit-hep/awkward-1.0/blob/ddab4313c30db001f0175406cfbe437855264292/src/awkward1/_connect/_numba/arrayview.py#L410-L421

It looks like 48 bytes. 48 bytes get stack-allocated with each __getitem__.

There's no reference counting: a single reference to the Python view of the array is kept throughout the JITed function execution, so everything within is a safe borrow. Iteration creates a single Iterator whose index updates in place, but then data are extracted at each step of iteration with a __getitem__ from that index, so for x in array would have the same performance as for i in range(len(array)) and array[i].
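
As a rough mental model of the description above (a plain-Python analogy, not the actual Numba lowering), each view can be pictured as a small fixed-size record that refers to shared buffers and is created afresh by every __getitem__. The class ListView and its fields are invented for this illustration:

```python
from dataclasses import dataclass


@dataclass
class ListView:
    """Fixed-size record: references to shared buffers plus a range."""
    offsets: list   # shared buffer, never copied
    content: list   # shared buffer, never copied
    start: int      # first list (depth 0) or first element (depth 1)
    stop: int       # one past the last list or element
    depth: int = 0  # 0: view of lists, 1: view of one list's elements

    def __len__(self):
        return self.stop - self.start

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)  # also ends `for x in view`
        if self.depth == 1:
            return self.content[self.start + i]
        j = self.start + i
        # A new small record; the buffers themselves are only referenced.
        return ListView(self.offsets, self.content,
                        self.offsets[j], self.offsets[j + 1], depth=1)


offsets = [0, 3, 3, 5]
content = [1.1, 2.2, 3.3, 4.4, 5.5]
array = ListView(offsets, content, 0, len(offsets) - 1)

# Iteration falls back to __getitem__ with 0, 1, 2, ..., so both loops
# do the same work per step.
by_iteration = sum(len(x) for x in array)
by_index = sum(len(array[i]) for i in range(len(array)))
print(by_iteration, by_index)
```

In this toy model, iterating and indexing go through the same __getitem__ and each step only builds one small record, which mirrors the claim that the two loop styles should perform alike.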

jpivarski avatar Nov 12 '20 18:11 jpivarski

That seems ok then. As a designer I would personally be bothered by this, but as a user of the library, this is completely fine.

HDembinski avatar Nov 12 '20 18:11 HDembinski