fix: add cuda backend support for `to_raggedtensor` and `from_raggedtensor` functions
Codecov Report
Attention: Patch coverage is 12.24490% with 43 lines in your changes missing coverage. Please review.
Project coverage is 82.17%. Comparing base (b749e49) to head (5ee7f0c). Report is 179 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/awkward/operations/ak_to_raggedtensor.py | 13.33% | 26 Missing :warning: |
| src/awkward/operations/ak_from_raggedtensor.py | 10.52% | 17 Missing :warning: |
Additional details and impacted files
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/awkward/operations/ak_from_raggedtensor.py | 22.72% <10.52%> (ø) | |
| src/awkward/operations/ak_to_raggedtensor.py | 21.81% <13.33%> (ø) | |
@jpivarski while trying to make the `to_raggedtensor` function keep the device of the original Awkward Array, I stumbled upon an issue. TensorFlow automatically selects the GPU for computation if one is available. Yet if I run the following code on a GPU machine, it does return a tensor on the CPU:
```python
import tensorflow as tf

def function():
    with tf.device('CPU:0'):
        a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    return a

a = function()
a.device
```
```
/job:localhost/replica:0/task:0/device:CPU:0
```
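(For context, the device of the input can be read off on the Awkward side. This is a minimal check, not part of the change under review; `ak.backend` and `ak.to_backend` are the standard Awkward Array APIs, and the second half assumes a CuPy-enabled installation with a CUDA-capable GPU:)

```python
import awkward as ak

arr = ak.Array([[1.1, 2.2], [3.3]])
print(ak.backend(arr))                 # 'cpu'

cuda_arr = ak.to_backend(arr, "cuda")  # assumes CuPy and a CUDA-capable GPU
print(ak.backend(cuda_arr))            # 'cuda'
```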
However, if I try to do the same with the `to_raggedtensor` function, the intermediate ragged tensor is allocated on the CPU (the print on line 78 says that it's on the CPU), but the resulting tensor is allocated on the GPU:
```python
to_raggedtensor(ak.Array([[[1.1, 2.2], [3.3]], [], [[4.4, 5.5]]]))[0][0].device
```
```
/job:localhost/replica:0/task:0/device:GPU:0
```
Should I make the function follow TensorFlow's device policy and automatically select a device, or create some kind of workaround?
`ak.to_raggedtensor` should return a `RaggedTensor` on the same device as the Awkward Array, as a view (no copy) if possible. That may mean that the implementation needs to specify non-default arguments of the `RaggedTensor` constructor (or use the `with tf.device(...)` block) in order to control it.
If this is not possible and TensorFlow returns an object whose backend depends on what hardware is available (a terrible practice! shame on TensorFlow!), then we'll have to explain that (apologetically) in our documentation.
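For illustration, here is a minimal sketch of the device-pinning idea, not the actual implementation: it assumes a single level of raggedness, that `ak.backend` distinguishes `'cpu'` from `'cuda'`, and that a `with tf.device(...)` scope is enough to control where `tf.RaggedTensor.from_row_lengths` places its tensors (which is exactly the open question above):

```python
import awkward as ak
import numpy as np
import tensorflow as tf

def to_ragged_on_matching_device(array):
    # Hypothetical helper, not the library function: build the RaggedTensor
    # on the device that matches the Awkward Array's backend.
    device = "GPU:0" if ak.backend(array) == "cuda" else "CPU:0"
    # For the sketch, round-trip through host NumPy; a real implementation
    # would avoid this copy for CUDA-backed arrays.
    host = ak.to_backend(array, "cpu")
    values = np.asarray(ak.flatten(host))   # flattened contents
    row_lengths = np.asarray(ak.num(host))  # length of each sublist
    with tf.device(device):
        return tf.RaggedTensor.from_row_lengths(values, row_lengths)

rt = to_ragged_on_matching_device(ak.Array([[1.1, 2.2], [3.3], []]))
print(rt.values.device)  # should report the pinned device if tf.device applies here
```

If the `with` block turns out not to pin placement for `RaggedTensor` construction, the same shape of code could instead pass tensors pre-placed with `tf.constant` inside the device scope to `from_row_lengths`.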