Optimize list_to_packed to avoid for loop

Open Ruishenl opened this issue 1 year ago • 5 comments

For larger N and Mi values (e.g. N=154, Mi=238), I noticed that list_to_packed() had become a bottleneck in my application. After removing the for loop and running on the GPU, I see a 10-20x speedup.

Ruishenl avatar Feb 22 '24 21:02 Ruishenl

The function needs to work with items of different lengths (Mi different for each i). I think you need to sum over the lengths, and then this should work.

bottler avatar Feb 23 '24 10:02 bottler

> The function needs to work with items of different lengths (Mi different for each i). I think you need to sum over the lengths, and then this should work.

Updated.

Ruishenl avatar Feb 24 '24 01:02 Ruishenl
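
For readers following the thread, here is a minimal sketch of how a loop-free list_to_packed could handle items of different lengths. It is not the merged diff: the function name list_to_packed_sketch is hypothetical, and the four return values are assumed to mirror those of pytorch3d's list_to_packed. The per-item lengths are accumulated with a cumulative sum to get each item's starting offset, and the packed-to-list index is built in one call instead of a Python loop.

```python
import torch


def list_to_packed_sketch(x):
    """Hypothetical loop-free packing of a list of (Mi, ...) tensors.

    Assumes every tensor lives on the same device and agrees on all
    dimensions except the first.
    """
    if not x:
        raise ValueError("Input list is empty")
    device = x[0].device

    # Mi for each item; the lengths may differ from item to item.
    num_items = torch.tensor(
        [xi.shape[0] for xi in x], dtype=torch.int64, device=device
    )

    # Exclusive cumulative sum: start offset of each item in the packed tensor.
    item_packed_first_idx = torch.cumsum(num_items, dim=0) - num_items

    # For every packed row, the index of the list item it came from.
    item_packed_to_list_idx = torch.repeat_interleave(
        torch.arange(len(x), dtype=torch.int64, device=device), num_items
    )

    # A single concatenation replaces per-item copying.
    x_packed = torch.cat(x, dim=0)
    return x_packed, num_items, item_packed_first_idx, item_packed_to_list_idx


if __name__ == "__main__":
    items = [torch.zeros(2, 3), torch.ones(3, 3), torch.full((1, 3), 2.0)]
    packed, counts, first_idx, to_list = list_to_packed_sketch(items)
    print(packed.shape, counts.tolist(), first_idx.tolist(), to_list.tolist())
```

The total packed length is the sum of the Mi values, which is why summing over the lengths (rather than assuming N * Mi) is needed once the items are not all the same size.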

@bottler has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Feb 26 '24 09:02 facebook-github-bot

@Ruishenl has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot avatar Mar 12 '24 22:03 facebook-github-bot

@bottler has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Mar 13 '24 11:03 facebook-github-bot

@bottler merged this pull request in facebookresearch/pytorch3d@ccf22911d4daa74af7fbf70b3373bc0fe46d6d7c.

facebook-github-bot avatar Apr 02 '24 14:04 facebook-github-bot

@Ruishenl Thank you!

bottler avatar Apr 02 '24 14:04 bottler