gitea
[Summary] Translation system overhaul
Cleanup translations
- [x] #27699
- #24488
- #24402
General overhaul
- [x] #30054
- #23863
- Convert locale files to another format because of INI limitations and limited support
- Switch translation platform from Crowdin to Weblate
Docs translation
- #27499
- #27530
- #24715
- #23316
- #17081
- #20309
Sharing some of my thoughts.
Many translations in non-English locales are outdated or even incorrect, because they were not updated when the English messages changed.
To clean up the translations, the first step is to git blame every key's change, remove all outdated keys from the non-English locales, and sync to Crowdin.
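A minimal sketch of how the blame-based comparison could work, assuming the output of `git blame --line-porcelain` for the English file and for one locale file has already been captured as text. The function names `key_times` and `outdated_keys` are hypothetical, and the parser only handles plain `key = value` lines under `[section]` headers, not multi-line values.

```python
def key_times(porcelain: str) -> dict[str, int]:
    """Map each 'section.key' to the committer-time of the line that defines it.

    Expects the text produced by `git blame --line-porcelain <file>`, where
    every source line is preceded by its header lines and prefixed with a tab.
    """
    times: dict[str, int] = {}
    section = ""
    current_time = 0
    for line in porcelain.splitlines():
        if line.startswith("committer-time "):
            # Header line: remember the timestamp for the content line that follows.
            current_time = int(line.split()[1])
        elif line.startswith("\t"):
            content = line[1:].strip()
            if content.startswith("[") and content.endswith("]"):
                section = content[1:-1]
            elif "=" in content and not content.startswith((";", "#")):
                key = content.split("=", 1)[0].strip()
                full = f"{section}.{key}" if section else key
                times[full] = current_time
    return times


def outdated_keys(english: dict[str, int], locale: dict[str, int]) -> list[str]:
    """Return locale keys whose line was last changed before the English line."""
    return sorted(k for k, t in locale.items() if k in english and t < english[k])
```

Keys returned by `outdated_keys` would be the candidates for removal from the locale file. Whether blame timestamps on the locale files are reliable depends on how the Crowdin sync commits rewrite them, so the result should be reviewed before anything is deleted.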
We need to invalidate all affected tokens in all languages when the English string changes; it's the only sane config. It's doubly problematic if the % tokens in the message change, because that will break rendering.
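The % token problem can be checked mechanically. The following is a rough sketch, assuming the messages use Go-style printf verbs such as `%s`, `%d` or `%[1]s`; the regular expression is a simplification that ignores some valid forms (for example `*` widths and the space flag), and `placeholders` and `is_stale` are hypothetical helper names.

```python
import re

# Simplified pattern for printf-style verbs: optional flags, width, precision,
# an optional explicit argument index like [1], then the verb letter.
# "%%" is matched too so that a literal percent sign is not misread as a verb.
VERB = re.compile(r"%(?:%|[-+#0]*\d*(?:\.\d+)?(?:\[\d+\])?[a-zA-Z])")


def placeholders(message: str) -> list[str]:
    """Return the sorted list of format verbs in a message, ignoring '%%'."""
    return sorted(m for m in VERB.findall(message) if m != "%%")


def is_stale(english: str, translated: str) -> bool:
    """A translation is suspect if its verbs differ from the English ones."""
    return placeholders(english) != placeholders(translated)
```

Comparing sorted lists means a translation may reorder indexed verbs such as `%[2]d ... %[1]s` without being flagged, while a dropped or added verb is reported.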
Moved this out of 1.22 because a summary issue may cross multiple milestones.
I think a good first step would be for someone with access to the Crowdin admin panel to look into integrating https://github.com/crowdin/github-action, replacing the current INI workflow. Maybe a small script could be written that outputs the INI format to ease the transition.
It would enable contributors to configure Crowdin in-repo and likely gain a deeper understanding of the process in turn.
@go-gitea/technical-oversight-committee can anyone with Crowdin admin access get this started?
After fighting with the mess in our current translation files again and again, I think these are the steps we need to do:
1. Clean up translation strings massively, reduce the number of translations, and get rid of deep nesting.
2. Invalidate all translations whose source strings are newer.
3. Convert to a better file format, e.g. JSON or YML.
4. Move translations over to Weblate.
We could also do step 3 independently, but I don't have an overview of how the files are currently parsed.
One step I would like to do is to flatten the INI to make the keys greppable, e.g. turning
[foo]
bar.baz = qux
into
foo.bar.baz = qux
This can easily be done with a script and should be compatible with all the code. The only question is whether it might break anything in the Crowdin sync.
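Such a script could look roughly like the sketch below. It is not an existing tool: `flatten_ini` is a hypothetical name, and the sketch assumes one `key = value` pair per line, so multi-line values would need extra handling.

```python
def flatten_ini(text: str) -> str:
    """Prefix every key with its section name and drop the section headers."""
    out = []
    section = ""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Remember the section; the header itself is not emitted.
            section = stripped[1:-1].strip()
            continue
        if not stripped or stripped.startswith((";", "#")) or "=" not in stripped:
            # Blank lines, comments and anything unrecognized pass through unchanged.
            out.append(line)
            continue
        key, value = stripped.split("=", 1)
        key = key.strip()
        full = f"{section}.{key}" if section else key
        out.append(f"{full} = {value.strip()}")
    return "\n".join(out)
```

Applied to the example above, `flatten_ini("[foo]\nbar.baz = qux")` returns `foo.bar.baz = qux`. Splitting only on the first `=` keeps values that themselves contain `=` intact.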
Alright, I have some spare time for a moment, so let's finally go forward with this.
I propose the following steps:
1. Clean up the translation files by removing all strings that are older than their source strings and thus outdated, then sync to Crowdin (as proposed by @wxiaoguang).
2. Flatten the INI structure to get greppable strings (allowing for translation linting) and to ease further migration (as proposed above by @silverwind).
3. Convert the translation files to a better format (probably JSON).
4. Examine the possibility of switching to Weblate.
Of course, all these changes would be made with great caution, and I plan to always double-check that we don't break Crowdin.
@go-gitea/technical-oversight-committee please review the plan and tell me whether I can go forward with it.
- Switching to Weblate is still problematic, as it does not support pushing only approved translations back to the Git repo
@lafriks Are you sure? Their docs sound like they also have an approval system...
Yes, but translations are pushed back to the source code even before approval.
It's a long-open issue: https://github.com/WeblateOrg/weblate/issues/3745
Unfortunately, after a lot of research, I found out that step 1 cannot be done automatically, because Crowdin does not support deleting all translation strings that are missing from an uploaded file. I have started unapproving the outdated strings manually on Crowdin, but it will take a while...
I have now finished manually cleaning out the translation strings that are older than their source strings. It should be done with the next Crowdin pull :)