UnitY implementation

Open hirofumi0810 opened this issue 2 years ago • 0 comments

Before submitting

  • [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
  • [x] Did you read the contributor guideline?
  • [ ] Did you make sure to update the docs?
  • [ ] Did you write any new necessary tests?

What does this PR do?

This PR adds support for new speech-to-speech translation models based on two-pass decoders.

  • UnitY (text->unit)
  • Translatotron2 (text->spectrogram)

In addition to supporting the new models, I also made several updates that may affect other tasks:

  • Support R-Drop regularization (https://arxiv.org/abs/2106.14448)
  • Refactor sequence_generator.py
  • Support dual cross-attention for Transformer decoder
  • Support multi-task learning with arbitrary auxiliary tasks following the same design as S2UT models

Each of them is necessary to obtain the best result with UnitY.
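
For reviewers unfamiliar with R-Drop, here is a minimal standard-library sketch of the per-example objective as described in the paper linked above. It is not the code in this PR. It assumes the same input has been passed through the model twice with dropout active, giving two logit vectors. The names `r_drop_loss`, `logits_a`, `logits_b` and `alpha` are illustrative only.

```python
import math


def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(log_p, log_q):
    # KL(P || Q) computed from log-probabilities.
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def r_drop_loss(logits_a, logits_b, target, alpha=1.0):
    # logits_a / logits_b: outputs of two forward passes over the same
    # input, each with its own dropout mask (assumed, not from this PR).
    log_p = log_softmax(logits_a)
    log_q = log_softmax(logits_b)
    # Negative log-likelihood summed over both passes.
    nll = -log_p[target] - log_q[target]
    # Symmetric KL term that pulls the two predictive distributions together.
    sym_kl = 0.5 * (kl_divergence(log_p, log_q) + kl_divergence(log_q, log_p))
    return nll + alpha * sym_kl
```

When the two passes agree exactly, the KL term vanishes and the loss reduces to the summed negative log-likelihood. `alpha` controls how strongly any disagreement is penalized.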

PR review

Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

hirofumi0810, Aug 26 '22 01:08