
Allow jagged_index_select to accept pre-computed output shape

Open mrshenli opened this issue 1 year ago • 4 comments

Summary: jagged_index_select's CPU kernel API already accepts num_dense_output_rows as an argument. Generalize this to the CUDA kernel as well, which avoids a CPU-blocking .item() call on the CUDA path when users choose to pre-compute the value.
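
To illustrate the idea, here is a minimal pure-Python sketch of the semantics, not FBGEMM's implementation. The function name `jagged_index_select_ref` and its list-based signature are hypothetical; only the `num_dense_output_rows` name comes from the summary above. When the argument is omitted, the output row count has to be derived from the selected lengths, which is the step that would require a device-to-host read on GPU.

```python
from itertools import accumulate
from typing import List, Optional, Sequence, Tuple


def jagged_index_select_ref(
    values: Sequence[Sequence[float]],
    lengths: Sequence[int],
    indices: Sequence[int],
    num_dense_output_rows: Optional[int] = None,
) -> Tuple[List[List[float]], List[int]]:
    """Hypothetical reference model: select whole jagged segments by index."""
    # Start offset of each jagged segment inside `values`.
    offsets = [0] + list(accumulate(lengths))
    # Length of each selected segment, in selection order.
    output_lengths = [lengths[i] for i in indices]

    if num_dense_output_rows is None:
        # On a GPU this sum would live in device memory, so reading it back
        # to size the output is the blocking step the summary refers to.
        num_dense_output_rows = sum(output_lengths)

    # Allocate the output up front using the (possibly pre-computed) row count.
    output: List[List[float]] = [[] for _ in range(num_dense_output_rows)]
    row = 0
    for i in indices:
        # Copy every row of segment i into the next free output rows.
        for src in range(offsets[i], offsets[i + 1]):
            output[row] = list(values[src])
            row += 1
    return output, output_lengths


# Three segments of lengths 2, 1, 3 stored back to back.
values = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
lengths = [2, 1, 3]
indices = [2, 0]

# Without the hint, the row count is computed inside the call.
out_a, lens_a = jagged_index_select_ref(values, lengths, indices)
# With the hint, the caller supplies a row count it already knows.
out_b, lens_b = jagged_index_select_ref(
    values, lengths, indices, num_dense_output_rows=5
)
```

Both calls produce the same result. The caller is responsible for passing a row count that matches the selected segments; this sketch does not validate it, since a check of that kind would reintroduce the read-back that the pre-computed value is meant to avoid.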

Differential Revision: D54085880

mrshenli, Feb 23 '24 19:02