Support for fluent terms not defined in the base language?

Open Imberflur opened this issue 1 year ago • 16 comments

Describe the problem

Terms are similar to regular messages but they can only be used as references in other messages. Their identifiers start with a single dash - like in the example above: -brand-name. The runtime cannot retrieve terms directly.

https://projectfluent.org/fluent/guide/terms.html

Because terms are referenced by other messages it can be useful for translators to define terms that aren't present in the base language. We have a few languages that utilize this pattern and noticed a few issues when adopting Weblate:

  1. No interface for translators to add novel terms for a particular language (afaict).
  2. Since terms are treated like regular messages, the cleanup add-on will remove terms not present in the base language.

Describe the solution you'd like

I'm not familiar enough with Weblate to know how viable it is to address 1, but having an interface for this would be pretty useful.

For 2, it seems possible to add a check to the cleanup code so that it does not remove terms. A more ideal solution would be to check whether the term is referenced by any other message, but I assume that would be much harder to implement, and it could remove terms that are just briefly unused.
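To make the simpler check concrete, here is a minimal sketch. It is not Weblate's actual cleanup code: `is_term` and `stale_ids` are hypothetical names, and the only assumption about Fluent is that term identifiers start with a single dash.

```python
def is_term(identifier: str) -> bool:
    """Fluent term identifiers start with a single dash, e.g. ``-brand-name``."""
    return identifier.startswith("-")


def stale_ids(source_ids, translation_ids, keep_terms=True):
    """Return the translation identifiers a cleanup pass would remove.

    Hypothetical helper: an identifier is stale when it is missing from the
    base language. With ``keep_terms`` enabled, terms are never reported as
    stale, so locale-specific terms survive the cleanup.
    """
    source = set(source_ids)
    return [
        identifier
        for identifier in translation_ids
        if identifier not in source and not (keep_terms and is_term(identifier))
    ]
```

With `keep_terms=False` this behaves like the current cleanup, where a locale-only term is treated the same as any other leftover message.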

Describe alternatives you've considered

A workaround is to manually add empty placeholder terms to the base language when translators need them. This adds friction and is less discoverable, though perhaps we could cover it in our project-specific documentation for translators.

Imberflur avatar Sep 08 '23 14:09 Imberflur

Presently, Weblate requires the source language in monolingual translations to have all the strings. The reason is that this is the only way to define the source string, because the translation files do not have them.

nijel avatar Sep 11 '23 10:09 nijel

This issue has been automatically marked as stale because there wasn’t any recent activity.

It will be closed soon if no further action occurs.

Thank you for your contributions!

github-actions[bot] avatar Sep 26 '23 01:09 github-actions[bot]

> Presently, Weblate requires the source language in monolingual translations to have all the strings. The reason is that this is the only way to define the source string, because the translation files do not have them.

Can this limitation be lifted somehow? One solution could be a fake key (for each language/component) under which all terms are listed as plain strings.

juliancoffee avatar Jan 08 '24 17:01 juliancoffee

This would break several assumptions we currently have on translations, so it's hard to see what all would have to be changed.

nijel avatar Jan 13 '24 05:01 nijel

This issue has been put aside. It is currently unclear if it will ever be implemented as it seems to cover too narrow of a use case or doesn't seem to fit into Weblate.

Please try to clarify the use case or consider proposing something more generic to make it useful to more users.

github-actions[bot] avatar Jan 13 '24 05:01 github-actions[bot]

@nijel I thought about this more, and I'm increasingly leaning toward creating default entries for terms per component.

In general, Fluent may define messages in three ways.

# just key
some-key = Message
# attributes to key
some-key-with-attributes = Message
   .attr1 = First Attribute
   .attr2 = Second Attribute
# terms, private to the component, used mainly for guaranteed "syncing" of terminology, usually different per translation
-some-private-term = Weblate Inc
company-name = Provided by { -some-private-term }

Keys are still the only part of the public API. Attributes can only be accessed through keys, and terms cannot be accessed at all. Currently, Weblate handles attributes as a plain string on the key. I'm not sure why we can't create a dummy key for terms during serialization/deserialization so that Weblate better conforms to the Fluent spec.

If there are some assumptions this would break, can you please share them?

juliancoffee avatar Feb 02 '24 17:02 juliancoffee

If that's an acceptable solution, I could try making an MR to translate-toolkit

juliancoffee avatar Feb 05 '24 00:02 juliancoffee

Sorry, I don't know Fluent enough to decide this. @henry-torproject might be interested in discussing this (please let me know if I should stop bugging you with any Fluent related issues).

nijel avatar Feb 23 '24 12:02 nijel

> Sorry, I don't know Fluent enough to decide this. @henry-torproject might be interested in discussing this (please let me know if I should stop bugging you with any Fluent related issues).

I still work with Fluent so I don't mind :)

@Imberflur why are your translators using Fluent Terms that are not being used in the original language? And would you expect to include Terms in the original (en-US?) language, that translators should not translate? I.e. why should some locales be using a Term that others should not?

In the Firefox context, the Terms are only used for brand and organisation names, so are always used in translations as well: https://searchfox.org/mozilla-central/search?q=%5E-.&path=.ftl&case=false&regexp=true. And I'm not aware of any deviations from this in any of the other languages.

> A more ideal solution would be to check if the term is referenced by any other message, but I assume that would be much harder to implement and it could remove terms that are just briefly unused.

The main problem with this is that the Term may be defined in one file and used in another, i.e. in another component or even outside of Weblate. E.g. in Firefox, the "brand.ftl" file contains Terms that are used only in other .ftl files, and in many of them.
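A rough sketch of why a single-file scan is not enough. This is regex-based Python rather than a real Fluent parser, and `unreferenced_terms` plus the file contents are made up for illustration.

```python
import re

# Simplified patterns (assumption: no full Fluent parsing, just line-based matching).
TERM_DEF = re.compile(r"^(-[A-Za-z][A-Za-z0-9_-]*)[ \t]*=", re.MULTILINE)
TERM_REF = re.compile(r"\{\s*(-[A-Za-z][A-Za-z0-9_-]*)")


def unreferenced_terms(files: dict[str, str]) -> set[str]:
    """Return terms defined in ``files`` that none of those files reference."""
    defined, referenced = set(), set()
    for content in files.values():
        defined.update(TERM_DEF.findall(content))
        referenced.update(TERM_REF.findall(content))
    return defined - referenced


files = {
    "brand.ftl": "-brand-name = Firefox\n-unused = Example\n",
    "app.ftl": "about-title = About { -brand-name }\n",
}
```

Scanning both files reports only `-unused`, while scanning "brand.ftl" alone would also flag `-brand-name`. Any check scoped to one component has that blind spot, and it cannot see usages outside of Weblate at all.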

henry-torproject avatar Mar 04 '24 10:03 henry-torproject

Another note: the current Fluent references check assumes that translations should include the same Terms, so Weblate will give an error if the translation is missing a Term or contains an additional Term.

So if you want to make changes, then the test would also need to be adjustable to make an exception for Terms. By default though, I think most projects would want the current behaviour (Term identifiers are shared by all locales, and must be used by all) so any change would need to be an opt-in option.
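As a sketch of what such an opt-in could look like (this is not the existing Weblate check; `check_term_refs` and the `allow_extra_terms` flag are hypothetical names, and the regex is a simplification of Fluent's term reference syntax):

```python
import re

# Matches a term reference at the start of a placeable, e.g. "{ -brand-name }".
TERM_REF = re.compile(r"\{\s*(-[A-Za-z][A-Za-z0-9_-]*)")


def term_refs(value: str) -> set[str]:
    """Collect the term identifiers referenced in a Fluent value."""
    return set(TERM_REF.findall(value))


def check_term_refs(source: str, target: str, allow_extra_terms: bool = False):
    """Compare term references between a source string and its translation.

    Returns ``(missing, extra)``. By default both differences are reported,
    mirroring the behaviour described above. With the hypothetical
    ``allow_extra_terms`` opt-in, Terms that appear only in the translation
    are tolerated, while Terms missing from it are still reported.
    """
    src, tgt = term_refs(source), term_refs(target)
    missing = sorted(src - tgt)
    extra = [] if allow_extra_terms else sorted(tgt - src)
    return missing, extra
```

Keeping the default strict means projects that share Term identifiers across all locales see no change unless they enable the option.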

henry-torproject avatar Mar 04 '24 10:03 henry-torproject

> why are your translators using Fluent Terms that are not being used in the original language? And would you expect to include Terms in the original (en-US?) language, that translators should not translate? I.e. why should some locales be using a Term that others should not?

Well, that's an open question in an open-source project, but do you recommend avoiding Terms outside of the brand-name use case?

juliancoffee avatar Mar 04 '24 18:03 juliancoffee

@henry-torproject

juliancoffee avatar Mar 04 '24 18:03 juliancoffee

> Well, that's an open question in an open-source project, but do you recommend avoiding Terms outside of the brand-name use case?

@juliancoffee I've just not considered the possibility of using Terms for anything else, or a use case where one locale would use a Term that the other would not. But I've only used them in the Firefox context.

Do you perhaps have some examples of why your translators have found them useful in the past? It may be that some existing Weblate feature already helps, such as glossaries.

Also, which platform were you using before that supported this feature?

The main reasons to avoid locale-specific Terms are:

  • Weblate does not support them, and I would guess that it would be difficult to enable it.
  • Terms need to have unique IDs across all ".ftl" files loaded in scope (if you are using Fluent DOM or similar). As developers, you can ensure all your Fluent IDs are unique. But if you allow different translators to set the IDs, you run the risk of two Terms clashing.
  • It makes it harder to compare strings between different locales. E.g. the Fluent references check is less effective.
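To illustrate the clash risk from the second point, here is a small sketch that looks for Term IDs defined in more than one file. It uses a simplified regex instead of a Fluent parser, and `clashing_terms` and the file names are invented for the example.

```python
import re
from collections import defaultdict

# Simplified: a term definition is a line starting with "-identifier =".
TERM_DEF = re.compile(r"^(-[A-Za-z][A-Za-z0-9_-]*)[ \t]*=", re.MULTILINE)


def clashing_terms(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each Term ID defined in more than one file to the files defining it."""
    owners = defaultdict(list)
    for name, content in files.items():
        # A set, so a file is listed once even if it repeats a definition.
        for term_id in set(TERM_DEF.findall(content)):
            owners[term_id].append(name)
    return {term_id: names for term_id, names in owners.items() if len(names) > 1}


files = {
    "brand.ftl": "-brand-name = Firefox\n",
    "skills.ftl": "-brand-name = Example\n-aura-title = Aura\ntitle = { -aura-title }\n",
}
```

Here `clashing_terms(files)` reports `-brand-name` as defined in both files. When developers own all the IDs they can run a check like this once, but with translator-defined Terms it would have to run per locale.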

henry-torproject avatar Mar 05 '24 10:03 henry-torproject

@henry-torproject I checked our current uses of it, and it seems that translators use it to avoid duplication, like this:

-hud-skill-sc_wardaura_title = Aura del guardián
hud-skill-sc_wardaura_unlock_title = Desbloquear {{ -hud-skill-sc_wardaura_title }}
hud-skill-sc_wardaura_unlock = Emana de ti un aura que te protege a ti y a tus aliados{ $SP }
hud-skill-sc_wardaura_strength_title = Potencia de {{ -hud-skill-sc_wardaura_title }}
hud-skill-sc_wardaura_strength = La potencia de la protección aumenta en un { $boost } %{ $SP }
hud-skill-sc_wardaura_duration_title = Duración de {{ -hud-skill-sc_wardaura_title }}
hud-skill-sc_wardaura_duration = Los efectos de la protección duran un { $boost } % más{ $SP }
hud-skill-sc_wardaura_range_title = Alcance de {{ -hud-skill-sc_wardaura_title }}

This pattern may or may not work in other languages, so it doesn't make much sense to provide a source string for it. Nor is it desirable to make translators reproduce it in other languages, because forcing concatenation from the programming side is a clear anti-pattern. Still, I'm not sure how strictly we should limit translators if they want to de-duplicate.

For the time being, we allow contributions both through Weblate and as plain file edits in Merge Requests, because Weblate currently lacks some important quality-of-life features, such as Zen mode for suggestions and the ability to download/upload suggestions.

P.S. The same part in English, for reference:

hud-skill-sc_wardaura_unlock_title = Warding Aura Unlock
hud-skill-sc_wardaura_unlock = Allows you to ward your allies against enemy attacks.{ $SP }
hud-skill-sc_wardaura_strength_title = Strength
hud-skill-sc_wardaura_strength = The strength of your protection increases by { $boost } %.{ $SP }
hud-skill-sc_wardaura_duration_title = Duration
hud-skill-sc_wardaura_duration = The effects of your ward last { $boost } % longer.{ $SP }
hud-skill-sc_wardaura_range_title = Radius

juliancoffee avatar Mar 08 '24 14:03 juliancoffee

@juliancoffee Thanks for providing the context, and the English reference helps.

Without seeing the English, I would think that is an OK use of a Term for a feature or product name, but also something that all locales should be using. Note, I'm assuming that "Warding Aura" is the proper noun for some kind of product or feature.

However, it is not clear to me why the translation is including the "feature name" when the original does not. E.g. "Strength" gets translated to "Strength of Warding Aura", and "Duration" gets translated to "Duration of Warding Aura". I don't know Spanish, so maybe there is something I'm missing. But it seems the translation has different UI design goals to English: one is verbose whilst the other is using previous context to be able to abbreviate. If the translators think the abbreviation is not clear enough, maybe that is something that might need to change in the original English as well.

I would also say that if this is only used four times, a Fluent Term isn't really necessary. Unless the same feature name appears in other places, or you want to allow forks to re-brand it to a new name by swapping out the Term.

So I would personally suggest the following. If Spanish is the only locale that is using this Term for the feature name, then talk to your Spanish translators and ask why they are the exception. If there is a good reason, then just ask them to repeat the words without using Terms. If other locales are also using Terms for this feature, then include a Term in the English file, use it for hud-skill-sc_wardaura_unlock_title, and consider whether it should also be used for the other strings so the UI designs match.

Also, a small note: the double {{ is not necessary, a single { is sufficient.

henry-torproject avatar Mar 11 '24 09:03 henry-torproject

Thanks, I think we'll adjust our guidelines.

I'm not sure if we should close the issue, maybe someone will come up with their own use case, but our case is probably resolved. (although I guess the bot might close the issue anyway?)

juliancoffee avatar Mar 11 '24 10:03 juliancoffee