
Check if format specifiers are exactly the same

Open JensMertelmeyer opened this issue 7 years ago • 2 comments

If the format identifiers get messed up in a translation, Poedit can highlight the error. This is a brilliant feature!

Example

#. Programmer's name for it: minuteThreshholdFmt
#, c-format
msgid "Less than %d minutes ago"
msgstr "Vor weniger als d Minuten"

This will get highlighted in Poedit since the translator removed the % sign, thinking it was a good idea. So far, all is well.

Involving format specifiers

This check even considers format specifier flags: So something like "Current speed %.1f km/h" can be translated to "Geschwindigkeit: %.0f km/h". The translator removed the digit, but it is still a valid format string, and Poedit is aware of that. I would like to suggest adding a "strict" option so this also gets reported because the identifiers are not exactly the same anymore.
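To make the idea concrete, here is a minimal sketch of what such a "strict" comparison could look like. The regular expression is a simplified stand-in for C printf syntax and the names `specifiers` and `strict_mismatches` are hypothetical; none of this is how Poedit or msgfmt actually parse format strings.

```python
import re
from itertools import zip_longest

# Simplified grammar for C printf-style conversions. This is an assumption
# made for illustration; it is not the parser Poedit or msgfmt uses.
SPEC_RE = re.compile(
    r"%(?:\d+\$)?[-+ #0]*(?:\*|\d+)?(?:\.(?:\*|\d+))?"
    r"(?:hh|h|ll|l|L|j|z|t)?[diouxXeEfFgGaAcspn%]"
)


def specifiers(text):
    """Return every conversion found in text, skipping the literal '%%'."""
    return [m.group(0) for m in SPEC_RE.finditer(text)
            if not m.group(0).endswith("%")]


def strict_mismatches(msgid, msgstr):
    """Pair specifiers by order and return the pairs that are not identical."""
    pairs = zip_longest(specifiers(msgid), specifiers(msgstr))
    return [(src, dst) for src, dst in pairs if src != dst]
```

With the strings from above, `strict_mismatches("Current speed %.1f km/h", "Geschwindigkeit: %.0f km/h")` returns `[("%.1f", "%.0f")]`, which is the kind of difference a strict option could report. Because this sketch compares by order only, it would also flag a translation that merely reorders positional arguments.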

Why would you want this

While some floating point digits don't seem like a big deal, changing an identifier from %s to % s (mind the blank space) is also an option (according to the printf documentation). The printf method handles this well, but I learnt the hard way that other libraries do not support blank spaces for %s and others.

Of course Poedit is not to blame for this. I'm just trying to point out that a "strict check" option can be helpful. Many thanks for reading.

Example file

test_fmt.zip

JensMertelmeyer, Aug 09 '17 17:08

I think you are conflating two severity levels of the issue:

This check even considers format specifier flags: So something like "Current speed %.1f km/h" can be translated to "Geschwindigkeit: %.0f km/h". The translator removed the digit, but it is still a valid format string, and Poedit is aware of that. I would like to suggest adding a "strict" option so this also gets reported because the identifiers are not exactly the same anymore.

I'm not sure this is a good idea; there may be legitimate reasons for such changes in a translation. But a warning may be appropriate (a warning only, not a hard error or a "strict mode"; actual mismatches remain errors). Do note, however, that it is not that simple: some changes are always valid and commonly used (argument reordering).

As always, PRs to that effect are welcome.
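A rough sketch of the two severity levels, for anyone considering such a PR. Everything here is an assumption for illustration: the simplified regex, the hypothetical `parse` and `classify` helpers, and the choice to treat the length modifier plus conversion character as the argument "type". It is not Poedit's implementation.

```python
import re

# Simplified C printf grammar (illustrative assumption, not Poedit's parser).
SPEC_RE = re.compile(
    r"%(?:(?P<pos>\d+)\$)?(?P<flags>[-+ #0]*)(?P<width>\d+)?"
    r"(?:\.(?P<prec>\d+))?(?P<length>hh|h|ll|l|L|j|z|t)?"
    r"(?P<conv>[diouxXeEfFgGaAcsp%])"
)


def parse(text):
    """Map argument position -> (type, modifiers) for each conversion."""
    args = {}
    implicit = 0  # counter for conversions without an explicit N$ position
    for m in SPEC_RE.finditer(text):
        if m.group("conv") == "%":
            continue  # literal percent sign, consumes no argument
        if m.group("pos"):
            pos = int(m.group("pos"))
        else:
            implicit += 1
            pos = implicit
        arg_type = (m.group("length") or "") + m.group("conv")
        modifiers = (m.group("flags"), m.group("width"), m.group("prec"))
        args[pos] = (arg_type, modifiers)
    return args


def classify(msgid, msgstr):
    """Return 'error', 'warning' or 'ok' for a source/translation pair."""
    src, dst = parse(msgid), parse(msgstr)
    src_types = {pos: t for pos, (t, _) in src.items()}
    dst_types = {pos: t for pos, (t, _) in dst.items()}
    if src_types != dst_types:
        return "error"    # arguments missing, added, or of a different type
    if any(src[pos][1] != dst[pos][1] for pos in src):
        return "warning"  # same arguments, but flags/width/precision differ
    return "ok"           # identical per argument; reordering is accepted
```

Here `classify("Less than %d minutes ago", "Vor weniger als d Minuten")` gives `"error"`, the `%.1f` to `%.0f` change gives `"warning"`, and a reordered translation such as `"%2$d von %1$s"` for `"%1$s of %2$d"` gives `"ok"` because arguments are compared by position rather than by order of appearance.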

While some floating point digits don't seem like a big deal, changing an identifier from %s to % s (mind the blank space) is also an option (according to the printf documentation). The printf method handles this well, but I learnt the hard way that other libraries do not support blank spaces for %s and others.

Please be specific: what languages/libraries do not handle it, but Poedit does accept % s? As far as I know, msgfmt is pretty knowledgeable about this and does error out for languages where this isn't valid.

Regardless, this is a much clearer case: if the source string does not use the space flag (% s) and the translation does, it is near-certain that it was a mistake and absolutely should be reported.

vslavik, Aug 10 '17 09:08

Thanks for taking the time to reply.

I agree there might be reasons for "adjusting" the specifiers, and it's a good thing it's allowed. I also think it's a valid use case to know of these changes and to double-check them. Certainly a hard task, considering all the different format types (C, Java, Object Pascal, ...).

Which brings me to the % s example: It turns out the tool we were using had incorrectly tagged all the items with #, c-format where it should have been #, object-pascal-format as we were using a Delphi implementation. I was mistaken thinking they were compatible. I wasn't even aware gettext supports so many different formats.
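For anyone else running into this, the difference can be sketched roughly as follows. Both patterns are simplified approximations I wrote for illustration (the C one of printf syntax, the Pascal one of Delphi's `Format` syntax as I understand it); they are assumptions, not the grammars gettext actually implements, and `is_valid` is a hypothetical helper.

```python
import re

# Approximate grammars for a single conversion specifier. Both are
# simplifications assumed for illustration, not gettext's real parsers.
GRAMMARS = {
    # C: % [N$] [flags, including the space flag] [width] [.prec] [length] conv
    "c-format": re.compile(
        r"%(?:\d+\$)?[-+ #0]*(?:\*|\d+)?(?:\.(?:\*|\d+))?"
        r"(?:hh|h|ll|l|L|j|z|t)?[diouxXeEfFgGaAcspn]"
    ),
    # Object Pascal: % [N:] [-] [width] [.prec] type (no space flag)
    "object-pascal-format": re.compile(
        r"%(?:\d+:)?-?(?:\*|\d+)?(?:\.(?:\*|\d+))?[dDuUeEfFgGnNmMpPsSxX]"
    ),
}


def is_valid(spec, fmt):
    """True if spec is one complete specifier under the named grammar."""
    return GRAMMARS[fmt].fullmatch(spec) is not None
```

Under these assumptions `is_valid("% s", "c-format")` is true while `is_valid("% s", "object-pascal-format")` is false, which matches what we saw: with the wrong `#, c-format` tag the space slipped through, and with the correct tag it would have been rejected.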

Sorry for bringing this example up, it was nonsense.

JensMertelmeyer, Aug 10 '17 20:08