Auto-Synced-Translated-Dubs
Implement DeepL translation
Type of change
- [x] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [x] This change requires a documentation update
Proposed Changes
- Implement DeepL translation service
- Logic for batch translation
- Settings.
- Introduce the DeepL Python package, which is the official package for the API.
- Documented DeepL in the README
- Add logic so that a DeepL key is not required when using Google Translate and, vice versa, a Google API key or credentials are not required when using DeepL + Azure.
- Fix an issue with batch translation in the Google API that caused the array to go out of bounds, due to a comparison between the chunks of text and the whole list of texts to translate. For more info, see #26.
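For readers following the batch translation fix, here is a minimal sketch of chunked translation that only ever indexes within the current chunk. It is not the code from this PR: `chunk_texts`, `translate_in_batches`, the `translate_batch` callable, and the `max_per_request` limit are all hypothetical names used for illustration.

```python
from typing import Callable, List


def chunk_texts(texts: List[str], max_per_request: int) -> List[List[str]]:
    """Split the full list of texts into chunks of at most max_per_request items."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    return [
        texts[i:i + max_per_request]
        for i in range(0, len(texts), max_per_request)
    ]


def translate_in_batches(
    texts: List[str],
    translate_batch: Callable[[List[str]], List[str]],
    max_per_request: int = 100,  # assumed limit, not taken from any API
) -> List[str]:
    """Translate texts chunk by chunk and return results in the original order."""
    translated: List[str] = []
    for chunk in chunk_texts(texts, max_per_request):
        results = translate_batch(chunk)
        # Compare against the current chunk, not the whole list of texts.
        if len(results) != len(chunk):
            raise RuntimeError("translation service returned an unexpected count")
        translated.extend(results)
    return translated
```

Passing the translation call in as a function keeps the chunking logic independent of the service, so the same loop could serve both Google and DeepL.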
I've extensively tested everything in this PR and tried to cover every path the code can take. However, since this is a big change, it would be great to test it in an environment different from mine.
Additional Info
- Fixes #26
Checklist:
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have tested that the code works
Required:
- By contributing to this project, you agree to the terms of the GPLv3 license, and agree to grant the project owner the right to also provide or sell this software, including your contribution, to anyone under any other license, with no compensation to you.
Cool, I don't see any obvious issues. I should be able to fully test it out in the next couple of days.
Since DeepL doesn't support as many languages as Google Translate, I can add some kind of fallback to use Google Translate in the case a language isn't supported, like Arabic, Hindi, Korean, etc. Should be easy enough to just hard code a list to check against.
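A hard-coded fallback along those lines could look roughly like this. The language set only contains the three languages named in the comment above and would need checking against DeepL's current list; `DEEPL_UNSUPPORTED` and `pick_service` are hypothetical names, not part of the project.

```python
# Illustrative list only, taken from the languages named in the comment
# (Arabic, Hindi, Korean). It is an assumption, not DeepL's actual list.
DEEPL_UNSUPPORTED = {"ar", "hi", "ko"}


def pick_service(target_language: str, preferred: str = "deepl") -> str:
    """Return the service to use, falling back to Google when DeepL lacks the language."""
    # Compare on the base language so that e.g. "hi-IN" matches "hi".
    base = target_language.split("-")[0].lower()
    if preferred == "deepl" and base in DEEPL_UNSUPPORTED:
        return "google"
    return preferred
```

For example, `pick_service("ko")` would select Google, while `pick_service("de")` would stay with DeepL.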
Awesome!
It would also be possible to ask the API for the supported languages and do a dynamic fallback, that way it would stay updated with API changes on DeepL's side.
However, there are still some things that require attention. For example, the language `en` is no longer supported, only `en-US` or `en-GB`; the same applies to `pt` with `pt-BR` and `pt-PT`. I don't know why they are still listed for backward compatibility, but the library generates a deprecation exception when using those.
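A rough sketch of the dynamic check, combined with the regional-variant handling described above. The supported codes are passed in as a plain list so the example runs without an API key; in practice they might come from the official package's `Translator.get_target_languages()`, which I believe returns objects with a `code` attribute. `DEFAULT_VARIANTS` and `resolve_deepl_target` are hypothetical names, and the choice of `EN-US` and `PT-BR` as defaults is an assumption.

```python
from typing import Iterable, Optional

# Assumed defaults for bare codes that DeepL only accepts with a region.
DEFAULT_VARIANTS = {"en": "EN-US", "pt": "PT-BR"}


def resolve_deepl_target(requested: str, supported_codes: Iterable[str]) -> Optional[str]:
    """Return a DeepL target code for the requested language, or None if unsupported.

    A None result signals the caller to fall back to another service.
    """
    supported = {code.upper() for code in supported_codes}
    # Map bare "en" / "pt" to a regional variant before checking support.
    code = DEFAULT_VARIANTS.get(requested.lower(), requested).upper()
    return code if code in supported else None
```

Because the supported list is fetched at run time rather than hard-coded, the fallback would follow changes on DeepL's side without a code update.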
That's awesome! We are using DeepL to translate subtitles for Khan Academy, but we then run those through Amara.org with a manual review process, so we probably will not be able to use the integration here.
Did you consider making the translation service configurable in config.ini? E.g., change skip_translation = False to translation_service = { none, Google, DeepL }
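To illustrate the suggestion, reading such a setting with the standard library's `configparser` might look like the sketch below. The `[SETTINGS]` section name and the `read_translation_service` helper are assumptions for the example; the project's actual config layout may differ.

```python
import configparser

VALID_SERVICES = {"none", "google", "deepl"}


def read_translation_service(ini_text: str, section: str = "SETTINGS") -> str:
    """Parse config text and return the chosen translation service in lowercase."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Default to "none" when the option (or the whole section) is missing.
    value = parser.get(section, "translation_service", fallback="none").strip().lower()
    if value not in VALID_SERVICES:
        raise ValueError(f"unknown translation_service: {value!r}")
    return value
```

With this shape, a value of `none` would cover what `skip_translation = True` does today, so a single option replaces the boolean.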