Add RTS and token masking to top-2 gating + configurable jitter epsilon

Open ykim362 opened this issue 3 years ago • 2 comments

  1. Add Random Token Selection (RTS) to top-2 gating
  2. Add token masking to top-2 gating
  3. Add a no-drop-tokens option to top-2 gating
  4. Add a configurable jitter epsilon (for both top-1 and top-2 gating)
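
For readers unfamiliar with these options, the four items above can be sketched in plain Python. This is only an illustration under simplifying assumptions, not the code in this PR or DeepSpeed's implementation: the names `multiplicative_jitter`, `top2_experts`, and `route_top2` are hypothetical, the parameters `used_token`, `use_rts`, and `drop_tokens` are named for illustration only, and first and second choices are treated equally when capacity runs out, which a real top-2 gate would likely not do.

```python
import random


def multiplicative_jitter(values, epsilon, rng):
    """Scale each value by a factor drawn uniformly from [1 - epsilon, 1 + epsilon]."""
    return [v * rng.uniform(1.0 - epsilon, 1.0 + epsilon) for v in values]


def top2_experts(logits):
    """Return the indices of the two largest logits for one token."""
    order = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)
    return order[0], order[1]


def route_top2(token_logits, capacity, used_token=None,
               use_rts=True, drop_tokens=True, rng=None):
    """Assign each token to its two best experts (hypothetical sketch).

    token_logits: one list of per-expert logits for each token.
    capacity:     maximum tokens an expert keeps when drop_tokens is True.
    used_token:   optional list of booleans; False marks a masked token
                  (for example padding) that must not consume capacity.
    use_rts:      if True, overflow is resolved by random selection
                  instead of favoring tokens that come first in the sequence.
    Returns a list with, per expert, the sorted token indices it keeps.
    """
    rng = rng or random.Random()
    num_experts = len(token_logits[0])
    if used_token is None:
        used_token = [True] * len(token_logits)

    # Collect the candidate tokens for every expert, skipping masked tokens.
    candidates = [[] for _ in range(num_experts)]
    for t, logits in enumerate(token_logits):
        if not used_token[t]:
            continue
        for e in top2_experts(logits):
            candidates[e].append(t)

    assignments = []
    for tokens in candidates:
        if drop_tokens and len(tokens) > capacity:
            if use_rts:
                # Random Token Selection: every candidate is equally likely to stay.
                kept = sorted(rng.sample(tokens, capacity))
            else:
                # Position-based selection: the earliest tokens win.
                kept = tokens[:capacity]
        else:
            # No-drop mode (or no overflow): keep every candidate.
            kept = list(tokens)
        assignments.append(kept)
    return assignments


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [[3.0, 2.0, 1.0]] * 4
    noisy = [multiplicative_jitter(row, 0.01, rng) for row in logits]
    mask = [True, True, True, False]  # the last token is padding
    print(route_top2(noisy, capacity=2, used_token=mask, rng=rng))
```

With `drop_tokens=False` the capacity limit is ignored and every unmasked token reaches both of its experts; with `use_rts=False` the sketch falls back to keeping the earliest tokens, which is the positional bias that random selection is meant to avoid.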

ykim362 avatar Jan 17 '22 01:01 ykim362

Hi @awan-10. I am trying to add some missing features to top-2 gating which are currently only available for top-1 gating. Please take a look and let me know what you think.

ykim362 avatar Jan 17 '22 01:01 ykim362

Can one of the admins verify this patch?

rocm-mici avatar Jun 09 '22 20:06 rocm-mici

Hi @ykim362 - is this still a PR you'd like to see completed? If so, we can fix the conflicts and review it; otherwise, we would like to close it as part of cleaning up some older PRs.

loadams avatar Sep 01 '23 17:09 loadams

Hi @ykim362 - I'm going to close this PR for now. If this is still something you'd like to merge, just re-open it and we will be happy to review it promptly. Thanks for contributing to DeepSpeed!

loadams avatar Sep 06 '23 18:09 loadams