robotframework Add and document translations

RF 5.1 starts localization efforts with these ~two~ three enhancements:

#4096
#519
#4400

The issues above cover adding the support for translations. This issue exists for these purposes:

Actually adding translations for various languages.
Documenting what languages we support in the User Guide and elsewhere.
Collecting information about who has provided translations so that we can acknowledge them in the release notes.

The translation efforts have been led by @leeuwe, who also set up a Crowdin page for collaboration. We also have a dedicated #localization channel on our Slack.

Jul 05 '22 13:07 pekkaklarck

As mentioned above, we use Crowdin for collaboration and the #localization Slack channel for general discussion. You may also create your own language specific Slack channels for discussion if needed.

Translations at Crowdin are stored in YAML files with a name like Language.yml. This format isn't directly supported by Robot Framework, but you can convert the files to a suitable format using the crowdin.py script that can be found at https://github.com/MarketSquare/localization/. If you have a language file from Crowdin like Czech.yml, you can convert it with the script like this:

python crowdin.py Czech.yml Czech.py

After that you can use the generated language file with Robot Framework like this:

robot --language Czech.py example.robot

The generated class can also be added to Robot's languages module via a pull request to get it included as a built-in language.

Jul 06 '22 15:07 pekkaklarck

PR #4377 added translations for these languages:

Portuguese (PT) and Brazilian Portuguese (PT-BR) by @HelioGuilherme66
German (DE) by @Snooz82 and @Noordsestern
Dutch (NL) by @pimjansen and @leeuwe
Czech (CS) by @MoreFamed
French (FR) by @lesnake

Thanks to all for the contributions!

Jul 14 '22 20:07 pekkaklarck

Two new translations:

Thai (TH) by @up1
Polish (PL) by @bhirsz

Thank you both!

Jul 19 '22 11:07 pekkaklarck

Two more translations:

Spanish (ES) by Miguel Angel Apolayo Mendoza
Simplified Chinese (ZH-CN) by @mawentao119 and @nixuewei

Thank you both! These and everything above will be included in RF 5.1 alpha 2.

Jul 21 '22 17:07 pekkaklarck

Ukrainian (UK) :ukraine: by @Sunshine0000000 added. It will be included in beta 1. Thanks for the contribution!

Aug 10 '22 13:08 pekkaklarck

We are planning to change the language definition syntax a bit. Currently headers, settings and BDD prefixes are specified like this:

class En(Language):
    setting_headers = {'Settings', 'Setting'}
    variable_headers = {'Variables', 'Variable'}
    ...
    library = 'Library'
    resource = 'Resource'
    ...
    bdd_prefixes = {'Given', 'When', 'Then', 'And', 'But'}

The plan is to change it to this:

class En(Language):
    settings_header = 'Settings'
    setting_header = 'Setting'
    variables_header = 'Variables'
    variable_header = 'Variable'
    ...
    library_setting = 'Library'
    resource_setting = 'Resource'
    ...
    given_prefix = {'Given'}
    when_prefix = {'When'}
    ...

Reasons for these changes:

Having a one-to-one mapping for header names, like we already have for settings, makes it possible for external tools to translate data automatically.
Specifying headers as sets would allow totally different header names, not only singular and plural forms as in English. That could be rather confusing, so limiting the names, like we do with settings, makes sense.
The _setting postfix makes it explicit that these attributes are settings. This is especially important in the future when we are likely to add more terms to translate. It is also consistent with the _header and _prefix postfixes.
Splitting BDD prefixes from one set into separate given_prefix, when_prefix, etc. allows translating between terms similarly as with headers.
BDD prefixes need to have a set as the value, because in some languages the exact term to use depends on the context where it is used.
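To illustrate the first and fourth points, here is a minimal sketch of how an external tool could translate terms when every language class uses the same attribute names. The `Language` base class below is only a stand-in, `Xx` is a made-up language with invented terms, and `translate` is a hypothetical helper, not part of Robot Framework's API.

```python
class Language:
    """Stand-in for the real Language base class (assumption for this sketch)."""


class En(Language):
    settings_header = 'Settings'
    variables_header = 'Variables'
    library_setting = 'Library'
    resource_setting = 'Resource'
    given_prefix = {'Given'}
    when_prefix = {'When'}


class Xx(Language):
    """Made-up language; the terms below are invented for illustration."""
    settings_header = 'Zettings'
    variables_header = 'Wariables'
    library_setting = 'Librarium'
    resource_setting = 'Resourcium'
    given_prefix = {'Giwen', 'Giwenn'}
    when_prefix = {'Wen'}


def translate(term, source, target):
    """Translate `term` from `source` to `target` via the shared attribute name."""
    for attr, value in vars(source).items():
        if attr.startswith('_'):
            continue  # skip __module__, __doc__ and other internals
        if isinstance(value, set):
            if term in value:
                # BDD prefixes may have several variants; pick one deterministically.
                return sorted(getattr(target, attr))[0]
        elif value == term:
            return getattr(target, attr)
    raise ValueError(f"{term!r} is not a known term in {source.__name__}")


print(translate('Library', En, Xx))   # Librarium
print(translate('Given', En, Xx))     # Giwen
```

With the set-based `setting_headers` and a single `bdd_prefixes` set, the same lookup would have no attribute name to pair a term with its counterpart in the other language.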

These kinds of changes obviously break external tools, but we are still in the alpha phase so that ought to be fine. We want to make these changes before the first beta and hope the API then stays stable. If you are a developer of an external tool using this API, let us know if the changes are problematic or if you have ideas for further enhancements.

UPDATE: As discussed below, we will only support translating the plural header forms so we'll only have settings_header and no setting_header.

Aug 12 '22 10:08 pekkaklarck

@pekkaklarck, I'd like to object to your second point regarding limiting headers to singular and plural only. There are several languages where the dual number of nouns is mandatory. Among European languages, Slovenian is one example; but it is mandatory also in Arabic, with about half a billion native speakers. By "mandatory" I mean that the speakers are obliged to use a form of a noun other than singular or plural when they are referring to exactly two things (test cases, variables etc.). I am afraid it can be confusing for them if they are forced to use either the singular or the plural form for exactly two variables. The situation with settings is different: one can state exactly one thing after the 'Library' or 'Resource' statement, so a single possible form (denoting singular) is perfect here.

If we allow people to use their own language written in their own script, why not allow them to use their own grammatical rules?

Or, from a totally different point of view: if there really are problems with an unlimited number of grammatical numbers, limit them to only one. The rationale is: you can write an unlimited and indefinite count of variables/settings/... under a header; thus the only allowed form of a header name would denote 'many' or 'indefinitely many' things. This corresponds to the plural in English (and in Slovenian and Arabic as well, AFAIK).

That exactly-two-grammatical-numbers approach seems rather 'language-centric' to me... :(

Aug 12 '22 18:08 MoreFamed

If I were to decide what headers to accept now, I'd go with the plural forms only. There's no real benefit in supporting Test Case in addition to Test Cases.

The main reason we have supported the singular form is that originally Robot's data was in HTML tables and they typically had headers like this:

Test Case	Keyword	Argument	Argument
Example	Log	Message	INFO

We have preserved singular forms for backwards compatibility reasons. Removing them would require a long deprecation period and I'm honestly not sure whether it would be worth it. That said, we could easily make a decision that we only allow exactly one header with translations. English being "special" would be a bit odd, but I guess that wouldn't be a major issue. Obviously we could also deprecate singular forms to make it less special.

(Been typing this on mobile and accidentally pushed the Close button on the small screen.)

Aug 12 '22 21:08 pekkaklarck

I like your idea: translated headers can have just one form (plural), and (original) English can keep the singular forms as a legacy (for now). I am not much afraid of deprecating and removing the English singulars: I guess singulars are used much less than plurals (would a short survey in the community help?), Robotidy can do a good job, and RF and its users have already survived several limitations of its original, almost boundless freedom in writing things (I guess *** Settings *** was allowed to be written as * S_E t_t_i_N G s * years ago) :)

Aug 13 '22 18:08 MoreFamed

Yeah, limiting header translations to just one form sounds like a good idea. What we do with English can then be decided separately, but having a long-term plan to support only the plural form sounds fine. At this point we could just announce that plan and start emitting actual deprecation warnings only in RF 7 or later. Do you @leeuwe have any comments related to this? What about @yanne?

Aug 13 '22 21:08 pekkaklarck

Reading the thread I've been educated by @MoreFamed and since you both reached an agreement, I have no further comments

Aug 13 '22 21:08 leeuwe

Ok, let's allow translating only the plural headers and deprecate singular headers in general. The former is covered by this issue and will be done by @yanne when he makes the other updates to the class attributes of the Language class. The latter is covered by #4431 that I just submitted.
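A rough sketch of how the agreed behavior could look when headers are resolved: translated headers are accepted in one (plural) form only, while English singular forms still work but are flagged as deprecated. The `normalize_header` helper, the header subset and the made-up translation 'Zettings' are assumptions for illustration, not Robot Framework's actual implementation. The sketch also warns immediately, whereas the thread suggests real warnings might start only in RF 7 or later.

```python
import warnings

# Subset of English headers; the singular forms map to their plural counterparts.
ENGLISH_PLURAL = {'Settings', 'Variables', 'Test Cases', 'Keywords'}
ENGLISH_SINGULAR = {
    'Setting': 'Settings',
    'Variable': 'Variables',
    'Test Case': 'Test Cases',
    'Keyword': 'Keywords',
}


def normalize_header(name, translated=None):
    """Return the canonical English plural header for `name`.

    `translated` maps a translated (plural-only) header to the English plural.
    """
    translated = translated or {}
    if name in ENGLISH_PLURAL:
        return name
    if name in translated:
        return translated[name]
    if name in ENGLISH_SINGULAR:
        plural = ENGLISH_SINGULAR[name]
        warnings.warn(
            f"Singular header '{name}' is deprecated, use '{plural}' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return plural
    raise ValueError(f"Unrecognized header '{name}'.")


# 'Zettings' is an invented translation used only for this example.
print(normalize_header('Zettings', {'Zettings': 'Settings'}))
```

Keeping the singular lookup separate from the translated one means English stays the only language with a legacy form, which is the "special" case discussed above.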

Aug 15 '22 11:08 pekkaklarck

Russian translation by @axxyhtrx added. It will be part of beta 1. Thanks for the contribution!

Aug 15 '22 15:08 pekkaklarck

PR #4450 adds Bosnian provided by @Delilovic. It will still be merged before the beta, which ought to be released today. Thanks for the contribution!

Sep 02 '22 12:09 pekkaklarck

Turkish added by PR #4463. It will be included in beta 2. Thanks for the contribution, @yusufcanb!

Sep 20 '22 18:09 pekkaklarck

Updates:

French true/false words added in PR #4487 by @mmalorni.
Spanish true/false words added in PR #4489 by Miguel Angel Apolayo Mendoza.
Simplified Chinese updated and Traditional Chinese added in PR #4488 by @nixuewei.

These updates will be part of rc 1. Thanks for the contributions!

Sep 29 '22 10:09 pekkaklarck

PR #4499 brought us the following translations:

Swedish by @JockeJarre
Bulgarian by @naschenez
Romanian by @zastress

Thank you all for the contributions! These will be included in RF 6.0, because the release has been delayed a bit and we need another release candidate anyway.

Oct 07 '22 22:10 pekkaklarck

Italian was added in PR #4502 by @lugi0 and will be included in rc 2. Thanks for the contribution!

Oct 10 '22 13:10 pekkaklarck

PR #4506 added Hindi and it still makes it to RF 6.0. Thanks for the contribution, @bbpatel2001!

Oct 11 '22 12:10 pekkaklarck