
Consider adding an option to disable localized translations of technical error messages or terms

Open ynyyn opened this issue 1 year ago • 5 comments

Issue with Current Localization Translations

The current localization translations of error messages, especially those containing technical terms, may sometimes be ambiguous or misleading, causing confusion for users who are more comfortable with English technical terms.

Previous Discussion

A discussion in a closed issue highlights the concerns of multiple users regarding this matter, which can be found here: https://github.com/microsoft/pylance-release/issues/4579


As an external user, I appreciate the official efforts to localize these error messages, as some users may indeed need them. However, objectively speaking, the current localization translations, especially those containing technical terms, may sometimes be ambiguous or misleading and need further refinement.

It's worth noting that related plugins (pyright, which pylance is powered by) no longer accept external PR contributions to improve localization translations (source: https://github.com/microsoft/pyright/pull/8659#issuecomment-2269643727). Adjustments to localization translations need to be relayed by the official development team to the localization team, following an internal workflow for evaluation and implementation. This is undoubtedly a long-term process, requiring significant resources to gradually improve to a level that satisfies many users.


Proposed Solution

I hope that we, as "experienced" users, can have an option to separately disable the localization translations of these technical error messages and terms and instead display them in English (en) locale.

This would, to some extent, help users who are more comfortable with English technical terms better understand these error messages, especially when there are ambiguities in translation, without having to switch the entire VSCode localization locale.

Alternative Approach

An alternative solution could be to improve the localization translations, but this is a long-term process and may not be feasible in the short term.

Any suggestions or thoughts will be appreciated.

ynyyn avatar Aug 15 '24 08:08 ynyyn

Thanks for your thoughts on this! This is timely because I'm currently working on a related issue -- #6035.

I hope that we, as "experienced" users, can have an option to separately disable the localization translations of these technical error messages and terms and instead display them in English (en) locale.

I think you're saying that you'd like a way to have our diagnostics appear in English while having the strings in our UI (command names, code action names, etc) use the VS Code display language? Is that correct? Or are there other strings beyond the diagnostics that you'd want to see in English?

The current localization translations of error messages, especially those containing technical terms, may sometimes be ambiguous or misleading, causing confusion for users who are more comfortable with English technical terms.

Yes, I agree. Can you provide some specific examples that you've noticed?

For #6035, I'm currently pursuing the "Alternative Approach" that you mentioned above. I'm updating our localization process to allow us to provide comments (contextual info) along with each English string that gets localized. There are two high-level scenarios that I'm trying to address there:

  1. Words that should always be in English -- A simple example is "expectedBoolLiteral" where the English string is "Expected True or False". True and False in that string are Python keywords, but in some languages our loc team has translated them. For example, in French: "Attendu vrai ou faux". We can provide a comment that locks the words "True" and "False" to English, requiring all translations to include them unchanged.
  2. Concepts that the loc team may not understand -- For example, "expectedComplexNumberLiteral" where the English string is "Expected complex number literal for pattern matching". We can provide a comment explaining that "complex" here is referring to the mathematical concept of a real number plus an imaginary number, not a number that is complicated.

Unfortunately, there are a number of words that are not technically keywords, type names, etc. and therefore shouldn't be locked to English, but are either extremely hard to translate or are nuanced in some way that makes them easier to understand in English. For example:

  • "A frozen class cannot inherit from a class that is not frozen" -- The word "frozen" here is both a concept (freezing a dataclass to prevent changes) and an implicit reference to the frozen parameter on @dataclass.
  • "No overloaded function matches type {type}" -- This is similar in that "overloaded" is conceptual, but in English it has an implicit connection to the @overload decorator.
  • "'yield' not allowed inside a comprehension" -- I believe "comprehension" is a term that the Python community invented/repurposed in a non-standard way. Simply translating it to the target language's equivalent of the word "comprehension" won't help at all.

For experienced users it might make sense not to localize the words "frozen", "overloaded", and "comprehension" above (i.e. lock them to English), but for less experienced users or users with no English background, perhaps not. In cases like these, we're planning to provide the loc team with a comment that gives them more context on how the word is being used, and let each language make their own decision on how to handle them.

For users with a deeper understanding of the Python typing system, I think you're correct that our translations will never fully satisfy non-English speakers and there's probably a "tipping point" where a user gains enough understanding of the Python typing system that the English strings will be best for them.

It's worth noting that related plugins (pyright, which pylance is powered by) no longer accept external PR contributions to improve localization translations (source: https://github.com/microsoft/pyright/pull/8659#issuecomment-2269643727).

The phrase "no longer accept" stood out to me here. For what it's worth, this isn't new. We were never able to accept external contributions to our non-English strings. If Pyright has accepted changes of this type in the past, it was either because the person who merged the PR didn't understand that those new strings would be overwritten the next time we synced with the loc team's database, or because they filed a bug against the loc team to update the strings after accepting the PR.

debonte avatar Aug 15 '24 16:08 debonte

@debonte, I wonder if we should consider adding more metadata or formatting hints in the English strings to help the translators understand which words should remain untranslated. In some cases, these terms are already presented in double quotes. For example: "awaitNotAllowed": "Type annotations cannot use \"await\"",. In other cases, such as the "expectedBoolLiteral" example you mention above, keywords like True and False are not quoted. Adding quotes in this case would probably make the error message less readable, but we could use some other delimiter that is stripped out of the English string before it is presented to users. Maybe something like double asterisks?
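
To make the delimiter idea concrete, here is a minimal Python sketch. It assumes a hypothetical `**term**` marker convention; the helper names `display_text` and `marked_terms` are invented for illustration and are not part of Pyright or Pylance.

```python
import re

# Hypothetical convention: wrap terms that must stay untranslated in
# double asterisks, e.g. "Expected **True** or **False**".
_MARKED = re.compile(r"\*\*(.+?)\*\*")


def display_text(source: str) -> str:
    """Return the string as shown to users, with the markers stripped."""
    return _MARKED.sub(r"\1", source)


def marked_terms(source: str) -> list[str]:
    """Return the terms that translators would be asked to leave unchanged."""
    return _MARKED.findall(source)


english = "Expected **True** or **False**"
shown = display_text(english)
terms = marked_terms(english)
```

The trade-off of this approach is that every consumer of the English string has to strip the markers before display, which a comment-based mechanism avoids.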

erictraut avatar Aug 15 '24 16:08 erictraut

@erictraut, there's already an established mechanism for this, which we will leverage. The software that the loc team uses to enter translations recognizes certain directives within the loc comments. The most common one is Locked, which tells the software that the "locked" substrings must appear in the translated string; the software will not allow translations that violate that rule.

"expectedBoolLiteral": {
    "message": "Expected True or False",
    "comment": "{Locked='True';'False'}"
},
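
As a rough illustration of the rule that Locked enforces, the check can be sketched in Python. This is not the loc team's actual tooling: the parsing assumes the directive always looks like the single example above (quoted terms separated by semicolons), and `parse_locked` and `missing_locked` are hypothetical helper names.

```python
import re


def parse_locked(comment: str) -> list[str]:
    """Extract locked terms from a comment such as "{Locked='True';'False'}".

    Assumption: terms are single-quoted and separated by semicolons,
    as in the one example shown in this thread.
    """
    match = re.search(r"\{Locked=([^}]*)\}", comment)
    if match is None:
        return []
    return re.findall(r"'([^']*)'", match.group(1))


def missing_locked(entry: dict, translation: str) -> list[str]:
    """Return the locked terms that do not appear verbatim in a translation."""
    locked = parse_locked(entry.get("comment", ""))
    return [term for term in locked if term not in translation]


entry = {
    "message": "Expected True or False",
    "comment": "{Locked='True';'False'}",
}

# The French translation quoted earlier in this thread violates the rule.
rejected = missing_locked(entry, "Attendu vrai ou faux")
# A translation that keeps both keywords unchanged passes.
accepted = missing_locked(entry, "True ou False attendu")
```

An empty result means the translation satisfies the directive; a non-empty result lists the keywords the translator would need to restore.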

debonte avatar Aug 15 '24 16:08 debonte

Ah, nice! Yes, if there's an established convention, it makes sense to adopt that.

erictraut avatar Aug 15 '24 16:08 erictraut

I had the same thought and I found this issue.

I think you're saying that you'd like a way to have our diagnostics appear in English while having the strings in our UI (command names, code action names, etc) use the VS Code display language? Is that correct?

Yes, that seems correct. I would like to have this option and see the diagnostics in English, while the UI (or other apps) keeps using the wonderful translations.

kubotty avatar Sep 11 '24 09:09 kubotty

This issue has been fixed in prerelease version 2024.10.102, which we've just released. You can find the changelog here: CHANGELOG.md

heejaechang avatar Oct 24 '24 21:10 heejaechang