Introduce `gather()` and `scatter_add()` reference functions for token-routing in Mixture-of-Experts models.
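For readers without access to the diff, here is a rough NumPy sketch of what reference `gather()` and `scatter_add()` functions for token routing might look like. The signatures, argument names (`token_idx`, `num_tokens`), and the toy routing data are assumptions made for illustration; they are not taken from this PR.

```python
import numpy as np


def gather(tokens, token_idx):
    """Select rows of `tokens` ([num_tokens, hidden]) at `token_idx` ([num_slots])."""
    return tokens[token_idx]


def scatter_add(expert_out, token_idx, num_tokens):
    """Sum rows of `expert_out` ([num_slots, hidden]) back into [num_tokens, hidden]."""
    out = np.zeros((num_tokens, expert_out.shape[1]), dtype=expert_out.dtype)
    # np.add.at accumulates correctly when token_idx repeats (top-k > 1);
    # plain `out[token_idx] += expert_out` would keep only one write per index.
    np.add.at(out, token_idx, expert_out)
    return out


# Hypothetical routing: 3 tokens, each sent to 2 experts with weight 0.5.
tokens = np.arange(6, dtype=np.float32).reshape(3, 2)
token_idx = np.array([0, 2, 1, 0, 1, 2])
weights = np.full((6, 1), 0.5, dtype=np.float32)

routed = gather(tokens, token_idx)  # shape [6, 2], one row per expert slot
combined = scatter_add(routed * weights, token_idx, num_tokens=3)
```

With identity "experts" and weights that sum to 1 per token, `combined` should equal `tokens`, which makes a convenient round-trip check for a simple MoE test.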
@jesselu-google Do you want to include a simple MoE test to cover `gather()` and `scatter_add()`?
Unit tests are included in an internal-only test file. Let me know if you think it's worth replicating them here.
This PR has been automatically marked as stale because it has not had recent activity. It will be closed soon if no further activity occurs. Thank you for your contributions.
This PR was closed because it has been inactive for a while. Please reopen it if you are still working on it.