GPU-based vectorized `SpecAug`
What does this PR do?
Context: together with @galv, we found that feature normalization and SpecAugment take approximately 30% of the total forward step time in Canary training because of CPU-bottlenecked implementations. Feature normalization is addressed in #8964.
This PR adds a GPU-based SpecAugment. The original implementation loops over every example and every mask, and waits on the CPU RNG for each sampled value. The new implementation still applies masks sequentially, but it is vectorized over the batch dimension and uses the GPU's RNG. We measured an approximately 5x speedup (70 ms -> 17 ms in profiling; both numbers include profiler overhead). I also added a flag to revert to the old implementation in case anyone encounters a compatibility issue.
I validated visually that the new implementation behaves as expected.
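For reviewers who want the idea in isolation, here is a minimal PyTorch sketch of batch-vectorized masking as described above. It is not the code in this PR: the function names (`mask_along_axis`, `spec_augment_sketch`), the default parameter values, and the uniform sampling of widths and starts are assumptions made for illustration. The loop over masks stays, but each iteration samples one width and one start per example on the tensor's own device, then applies them with a single broadcasted comparison.

```python
import torch


def mask_along_axis(spec, lengths, num_masks, max_width, axis, mask_value=0.0):
    """Hypothetical helper: apply `num_masks` random masks along `axis`.

    spec:    (batch, freq, time) tensor; axis=1 masks frequency, axis=2 masks time
    lengths: (batch,) number of valid positions along `axis` for each example
    """
    device = spec.device
    batch, size = spec.shape[0], spec.shape[axis]
    lengths = lengths.to(device=device, dtype=torch.long)
    positions = torch.arange(size, device=device).unsqueeze(0)  # (1, size)
    for _ in range(num_masks):  # masks are still applied one after another
        # One width and one start per example, sampled on the tensor's device
        # (assumption: both are drawn uniformly).
        widths = torch.randint(0, max_width + 1, (batch,), device=device)
        widths = torch.minimum(widths, lengths)
        max_start = lengths - widths
        starts = (torch.rand(batch, device=device) * (max_start + 1)).long()
        starts = torch.minimum(starts, max_start)  # guard against float rounding
        starts, widths = starts.unsqueeze(1), widths.unsqueeze(1)
        # Broadcasted comparison builds the mask for the whole batch at once.
        mask = (positions >= starts) & (positions < starts + widths)  # (batch, size)
        mask = mask.unsqueeze(2) if axis == 1 else mask.unsqueeze(1)
        spec = spec.masked_fill(mask, mask_value)
    return spec


def spec_augment_sketch(spec, time_lengths, freq_masks=2, freq_width=27,
                        time_masks=2, time_width=100):
    """Hypothetical wrapper: frequency masks first, then time masks."""
    freq_lengths = torch.full_like(time_lengths, spec.shape[1])
    spec = mask_along_axis(spec, freq_lengths, freq_masks, freq_width, axis=1)
    return mask_along_axis(spec, time_lengths, time_masks, time_width, axis=2)
```

Because every random draw is created with `device=spec.device`, running this on a CUDA tensor keeps sampling and masking on the GPU, with no per-example round trip to the CPU.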
Old:
New:
Collection: ASR
Changelog
- 5x faster SpecAugment to reduce typical forward step time by ~10%.
Usage
- You can potentially add a usage example below
# Add a code snippet demonstrating how to use this
Before your PR is "Ready for review"
Pre checks:
- [ ] Make sure you read and followed Contributor guidelines
- [ ] Did you write any new necessary tests?
- [ ] Did you add or update any necessary documentation?
- [ ] Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- [ ] Reviewer: Does the PR have correct import guards for all optional libraries?
PR Type:
- [x] New Feature
- [ ] Bugfix
- [ ] Documentation
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed. The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information
- Related to # (issue)