i18n-manager
Syntax Highlighting (ICU and other Formats)
Overview
- Add syntax highlighting for message formats such as ICU, to help the translator/developer with complex translations that contain many variables (see examples below).
- Need to list the most used formats and know how to tokenize them.
- Something similar to what is done in this ICU editor:
- https://format-message.github.io/icu-message-format-for-translators/editor.html
- There, blue marks a variable (not to be translated) and black marks the words that should be translated.
- Another tool for the ICU format: http://guigrpa.github.io/mady/
- Observation: This issue started as another proposal (see edits for info).
Examples:
- Simple example with firstName and lastName variables:
Hello, {firstName} {lastName}!
- Only "Hello" should be edited; the rest is locked.
- Complex Example with:
{type, select, ACTIVITY {Activity(ies)} other {item(s)}} per volunteer
- Only "Activity(ies)", "item(s)" and "per volunteer" should be edited; the rest is locked.
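To make the "editable vs. locked" idea concrete, here is a minimal sketch of how the two examples above could be tokenized. It is not taken from i18n-manager or from the linked editors; the names `tokenize_icu` and `editable_parts` are hypothetical. It assumes balanced braces and ignores ICU apostrophe escaping and the plural `#` placeholder.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text); kind is "text" (editable) or "syntax" (locked)


def tokenize_icu(message: str) -> List[Token]:
    """Split an ICU-style message into editable text and locked syntax.

    Simplified sketch: a "{" inside message text opens an argument (locked),
    and a "{" inside an argument opens a nested sub-message (editable again).
    """
    tokens: List[Token] = []
    stack = ["msg"]  # "msg" = translatable text, "arg" = argument syntax

    def emit(kind: str, ch: str) -> None:
        # Merge consecutive characters of the same kind into one token.
        if tokens and tokens[-1][0] == kind:
            tokens[-1] = (kind, tokens[-1][1] + ch)
        else:
            tokens.append((kind, ch))

    for ch in message:
        mode = stack[-1]
        if ch == "{":
            stack.append("arg" if mode == "msg" else "msg")
            emit("syntax", ch)
        elif ch == "}" and len(stack) > 1:
            stack.pop()
            emit("syntax", ch)
        elif mode == "msg":
            emit("text", ch)
        else:
            emit("syntax", ch)
    return tokens


def editable_parts(message: str) -> List[str]:
    """Return only the pieces a translator would be allowed to change."""
    return [text for kind, text in tokenize_icu(message) if kind == "text"]


print(editable_parts("Hello, {firstName} {lastName}!"))
print(editable_parts(
    "{type, select, ACTIVITY {Activity(ies)} other {item(s)}} per volunteer"
))
```

A highlighter could color the "syntax" tokens and leave the "text" tokens in the default color. Joining all token texts gives back the original message, so nothing is lost by tokenizing.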
Hey @iurikothe,
Indeed this is a very important thing, but we have some problems with that:
- we can't just split the tokens and then translate the phrase parts individually:
- some languages have a different phrase structure. In Japanese, for example, translating "Welcome {name}!" should result in "{name} へようこそ!". The "{name}" placeholder ends up in a different position, so splitting and then joining again would result in bad translations
- the ICU message format is very complex in some cases, so parsing and separating it is very difficult. Also, remember that the i18n manager is message format agnostic, since it supports file formats that may or may not use the ICU format:
- one example is the ".arb" files, a JSON format for Flutter internationalization, which don't use the ICU format but the string interpolation feature, so the "hello example" would be "Hello $name!" instead of the curly braces pattern
- and last but not least, this tool uses the Google Translate API, which even today doesn't support translating this kind of source, due to the huge number of ways to tokenize the phrases
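The word-order problem can be shown with a small sketch. Everything here is illustrative: `split_translate_join`, `translate_segment` and the one-entry `SEGMENT_TRANSLATIONS` table are hypothetical stand-ins for a real translation call, and only simple `{name}` placeholders are handled.

```python
import re

# Matches simple, non-nested placeholders such as {name}.
PLACEHOLDER = re.compile(r"\{[^{}]*\}")

# Hypothetical stand-in for a per-segment translator (not a real API).
SEGMENT_TRANSLATIONS = {"Welcome ": "ようこそ "}


def translate_segment(segment: str) -> str:
    return SEGMENT_TRANSLATIONS.get(segment, segment)


def split_translate_join(source: str, translate) -> str:
    """Naive approach: translate the text between placeholders, then rejoin.

    The placeholders always stay in their source-language positions.
    """
    parts = re.split(r"(\{[^{}]*\})", source)
    return "".join(
        part if PLACEHOLDER.fullmatch(part) else translate(part)
        for part in parts
    )


def same_placeholders(source: str, translation: str) -> bool:
    """Check that a translation kept every placeholder, in any order."""
    return sorted(PLACEHOLDER.findall(source)) == sorted(
        PLACEHOLDER.findall(translation)
    )


naive = split_translate_join("Welcome {name}!", translate_segment)
expected = "{name} へようこそ!"  # translation quoted in the comment above

print(naive)                                           # placeholder stays after the greeting
print(naive == expected)                               # False: word order differs
print(same_placeholders("Welcome {name}!", expected))  # True
```

The naive result keeps "{name}" after the greeting, while the expected Japanese sentence puts it first. A check like `same_placeholders` only confirms that no variable was lost; it cannot tell whether the word order is right.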
Thanks for the info, Gilmar! The best solution really does seem to be very complex.
Maybe an easier and still really good feature would be syntax highlighting for different formats (ICU, etc.). This way, the human translator/developer would get visual help to see and fix the variables. Even after running Google's API, one could fix the translations with this visual help.
And specifically for the ICU format, we could find the syntax highlighting code in the GitHub projects I've posted above. Other formats should have this code available too.
What do you think? Maybe add this as a low priority feature in the roadmap?
Yep, indeed it's very difficult.
Good idea, syntax highlighting is always a good feature. We need to list the most used formats and know how to tokenize them. Right now the next release has some big changes and will be a major release (3.0).
I think this feature could go into 3.1, maybe.
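As a rough sketch of what "list the formats and know how to tokenize them" could look like, the snippet below keeps one variable pattern per format and returns highlight spans. The format keys, `VARIABLE_PATTERNS` and `highlight_spans` are hypothetical names, not part of i18n-manager. The `$name` pattern follows the "Hello $name!" example from this thread, and the ICU pattern covers only flat `{name}` arguments, not select or plural.

```python
import re
from typing import Dict, List, Tuple

# One "variable" pattern per message format (illustrative, not exhaustive).
VARIABLE_PATTERNS: Dict[str, re.Pattern] = {
    "icu-simple": re.compile(r"\{\w+\}"),  # e.g. {firstName}
    "dollar": re.compile(r"\$\w+"),        # e.g. $name
}

Span = Tuple[int, int, str]  # (start, end, kind); kind is "text" or "variable"


def highlight_spans(message: str, fmt: str) -> List[Span]:
    """Return contiguous spans covering the message, tagged for highlighting."""
    spans: List[Span] = []
    pos = 0
    for match in VARIABLE_PATTERNS[fmt].finditer(message):
        if match.start() > pos:
            spans.append((pos, match.start(), "text"))
        spans.append((match.start(), match.end(), "variable"))
        pos = match.end()
    if pos < len(message):
        spans.append((pos, len(message), "text"))
    return spans


for start, end, kind in highlight_spans("Hello $name!", "dollar"):
    print(kind, repr("Hello $name!"[start:end]))
```

Because the tokenizer only marks character ranges, the stored translation string is never modified, which fits a message-format-agnostic tool: supporting a new format means adding one more entry to the table.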
Perfect! I've renamed this issue and updated the first comment.
Amazing!