UnitY implementation
Before submitting
- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
- [x] Did you read the contributor guideline?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?
What does this PR do?
This PR adds support for new speech-to-speech translation models based on two-pass decoders (a minimal sketch of the two-pass design follows the list):
- UnitY (text->unit)
- Translatotron2 (text->spectrogram)
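For context, here is a minimal sketch of the two-pass idea; the class and argument names are illustrative, not the actual fairseq modules. A first decoder predicts the target text from the speech encoder output, and a second decoder predicts units (or spectrograms) conditioned on the first-pass decoder states.

```python
# Minimal sketch of a two-pass decoder (illustrative names, not the fairseq API).
import torch.nn as nn

class TwoPassDecoderSketch(nn.Module):
    def __init__(self, d_model=512, text_vocab=10000, unit_vocab=1000):
        super().__init__()
        # First pass: decode target text from the speech encoder output.
        self.text_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True), num_layers=4
        )
        # Second pass: decode discrete units, conditioned on the text decoder states.
        self.unit_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True), num_layers=4
        )
        self.text_out = nn.Linear(d_model, text_vocab)
        self.unit_out = nn.Linear(d_model, unit_vocab)

    def forward(self, encoder_out, text_prev_emb, unit_prev_emb):
        # Pass 1: text decoder attends to the speech encoder output.
        text_states = self.text_decoder(text_prev_emb, encoder_out)
        # Pass 2: unit decoder attends to the first-pass decoder states
        # (with dual cross-attention it would also attend to encoder_out).
        unit_states = self.unit_decoder(unit_prev_emb, text_states)
        return self.text_out(text_states), self.unit_out(unit_states)
```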
In addition to supporting the new models, I also made several updates that may affect other tasks:
- Support R-Drop regularization (https://arxiv.org/abs/2106.14448)
- Refactor sequence_generator.py
- Support dual cross-attention for Transformer decoder
- Support multi-task learning with arbitrary auxiliary tasks following the same design as S2UT models
Each of these changes is necessary to obtain the best results with UnitY; minimal sketches of the R-Drop objective and the dual cross-attention layer are included below for reference.
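R-Drop runs the same batch through the model twice so the two passes see different dropout masks, and adds a symmetric KL term between the two output distributions on top of the usual cross-entropy. A minimal sketch, where the function and argument names are assumptions rather than the actual fairseq criterion:

```python
# Minimal sketch of the R-Drop objective (illustrative, not the fairseq criterion).
import torch.nn.functional as F

def rdrop_loss(model, src, tgt_in, tgt_out, alpha=1.0):
    # Two forward passes over the same batch; dropout makes the outputs differ.
    logits1 = model(src, tgt_in)  # (batch, time, vocab)
    logits2 = model(src, tgt_in)

    # Standard cross-entropy on both passes.
    ce = F.cross_entropy(logits1.transpose(1, 2), tgt_out) + \
         F.cross_entropy(logits2.transpose(1, 2), tgt_out)

    # Symmetric KL divergence between the two predicted distributions.
    p1 = F.log_softmax(logits1, dim=-1)
    p2 = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (
        F.kl_div(p1, p2, log_target=True, reduction="batchmean")
        + F.kl_div(p2, p1, log_target=True, reduction="batchmean")
    )
    return ce + alpha * kl
```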
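And a minimal sketch of a dual cross-attention decoder layer, where the second-pass decoder attends to both the speech encoder output and the first-pass decoder states (again with illustrative names, not the actual fairseq layer):

```python
# Minimal sketch of dual cross-attention (illustrative, not the fairseq layer).
import torch.nn as nn

class DualCrossAttentionLayer(nn.Module):
    """Decoder layer attending to two memories: the speech encoder output
    and the first-pass (text) decoder states."""

    def __init__(self, d_model=512, nhead=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.cross_attn_enc = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.cross_attn_text = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(4))

    def forward(self, x, encoder_out, text_decoder_out):
        x = self.norms[0](x + self.self_attn(x, x, x)[0])
        # Cross-attention over the speech encoder output.
        x = self.norms[1](x + self.cross_attn_enc(x, encoder_out, encoder_out)[0])
        # Second cross-attention over the first-pass decoder states.
        x = self.norms[2](x + self.cross_attn_text(x, text_decoder_out, text_decoder_out)[0])
        return self.norms[3](x + self.ffn(x))
```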
PR review
Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃