firefox-translations-training
[meta] Train harder-to-segment languages, like CJK languages
The harder-to-segment languages are Chinese, Japanese, and Korean. We'll need to implement better tokenization and segmentation support for these languages in order to train them. This work should happen after training a subset of the easier-to-segment languages in #524.
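To illustrate why these languages need dedicated segmentation support, here is a minimal sketch in Python. It assumes nothing about the actual training pipeline: the helper names (`naive_whitespace_tokens`, `char_level_tokens`) and the chosen Unicode ranges are hypothetical and only for illustration. Chinese and Japanese are typically written without spaces between words, so whitespace splitting returns the whole sentence as one token. The character-level fallback shown here is a crude baseline, not the approach proposed in the linked issues.

```python
import re

# Hypothetical pattern for illustration: CJK Unified Ideographs plus
# Hiragana and Katakana. Real pipelines need broader coverage.
HAN_OR_KANA = re.compile(r"[\u4e00-\u9fff\u3040-\u30ff]")


def naive_whitespace_tokens(text: str) -> list[str]:
    """Split on whitespace only, as is common for space-delimited languages."""
    return text.split()


def char_level_tokens(text: str) -> list[str]:
    """Emit each Han/kana character as its own token; keep other runs whole."""
    tokens = []
    buffer = ""
    for ch in text:
        if HAN_OR_KANA.match(ch):
            # Flush any pending non-CJK run before emitting the character.
            if buffer:
                tokens.append(buffer)
                buffer = ""
            tokens.append(ch)
        elif ch.isspace():
            if buffer:
                tokens.append(buffer)
                buffer = ""
        else:
            buffer += ch
    if buffer:
        tokens.append(buffer)
    return tokens


if __name__ == "__main__":
    sentence = "我喜欢 Firefox"
    print(naive_whitespace_tokens(sentence))
    print(char_level_tokens(sentence))
```

Whitespace splitting leaves the three Chinese characters fused into a single token, while the character-level fallback separates them and keeps the Latin-script word intact. Neither recovers real word boundaries, which is why proper segmentation support is tracked here.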
### Perform basic training
- [ ] https://github.com/mozilla/firefox-translations-training/issues/740
- [ ] https://github.com/mozilla/firefox-translations-training/issues/76
- [ ] https://github.com/mozilla/firefox-translations-training/issues/752
- [ ] #424
- [ ] https://github.com/mozilla/firefox-translations-training/issues/745
- [ ] https://github.com/mozilla/firefox-translations-training/issues/747
- [ ] https://github.com/mozilla/firefox-translations-training/issues/746
- [ ] #45
- [x] Train a basic teacher model
### Implement advanced features
- [ ] https://github.com/mozilla/firefox-translations-training/issues/741
- [ ] https://github.com/mozilla/firefox-translations-training/issues/743
- [ ] https://github.com/mozilla/firefox-translations-training/issues/744
- [ ] https://github.com/mozilla/firefox-translations-training/issues/742
- [ ] https://github.com/mozilla/firefox-translations-training/issues/749
- [ ] https://github.com/mozilla/firefox-translations-training/issues/750
- [ ] https://github.com/mozilla/firefox-translations-training/issues/751
- [ ] https://github.com/mozilla/firefox-translations-training/issues/753
- [ ] https://github.com/mozilla/firefox-translations-training/issues/748
- [ ] Train a good quantized model
- [ ] https://github.com/mozilla/firefox-translations-training/issues/860
- [ ] https://github.com/mozilla/firefox-translations-training/issues/896
- [ ] https://github.com/mozilla/firefox-translations-training/issues/899
### Run production training
- [ ] Train Chinese
- [ ] Train Japanese
- [ ] Train Korean
### Native Speakers
If you are a native (L1) speaker of any of these languages and want to help out, feel free to leave a comment on this issue or join us in Firefox Translations on Matrix. We can always use help with qualitative model evaluation and with questions about the languages.