ATen
Tensor factories (and functions) should accept Tensors/Scalars as arguments to reduce sync points (?)
Currently, if you want to create a new tensor whose size comes from data residing on the GPU, a sync is needed (imho). It would be great if something like the following could work:
```cpp
... my_func(...) {
    ...
    my_gpu_tensor = ...; // size N x 1
    // now we want to create a new M x 1 tensor where M = my_gpu_tensor[N-1][0]
    auto new_size = Scalar(my_gpu_tensor[N-1][0]);
    auto new_tensor = my_gpu_tensor.type().tensor({new_size});
    // or a tensor whose sizes are given by my_gpu_tensor.slice(0, N-2)
    auto new_tensor_2 = my_gpu_tensor.type().tensor(my_gpu_tensor.slice(0, N-2).squeeze());
    ...
}
```
Currently I first have to do something like `new_size = new_size.to<int>()` to make it work.
But from my understanding this introduces a device-to-host copy.
Hence it interrupts the asynchronous nature of the GPU calls, and it prevents me from launching
my_func on several streams and then waiting on all of them together.
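To make the sync point concrete, here is a minimal sketch of the workaround as I understand it (the exact `Scalar`/`Type` calls are my assumption from the ATen headers, not tested code):

```cpp
#include <ATen/ATen.h>

// Hedged sketch, assuming Scalar can be constructed from a one-element
// Tensor and that Type::tensor(IntList) is the factory entry point.
at::Tensor make_from_gpu_size(const at::Tensor& my_gpu_tensor) {
    // Scalar::to<T>() materializes the value on the host -- this is the
    // sync point: the pending GPU work must finish before the value exists.
    auto new_size = at::Scalar(my_gpu_tensor[0][0]).to<int64_t>();
    // Only now can the factory be called with a host-side size.
    return my_gpu_tensor.type().tensor({new_size});
}
```

If the factory could accept the Scalar (or a size Tensor) directly, the host round-trip above would disappear and the call could stay enqueued on the stream.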
As I am not deep enough into the ATen sources: is it technically possible, with reasonable effort, to make this work? Or is there actually a way to do this that I did not see?
regards c.hofer