Properties translation - Some single quotes are removed from the UI

Open bappun opened this issue 3 years ago • 4 comments

What steps does it take to reproduce the issue? Use single quotes (') in the values for some properties.

  • When does this issue occur? Only for some keys in the property files. Maybe for translations parsed as HTML?

  • Which page(s) does it occur on? Observed on the login page, but that is not the only case.

  • What happens? Quotes in the property files can disappear in the UI. For example, the French Bundle.properties contains this: login.signup.blurb=<a href="{0}">S'inscrire pour obtenir un compte Dataverse</a>. However, the word S'inscrire becomes Sinscrire in the UI (see screenshot).

  • To whom does it occur (all users, curators, superusers)? All users

  • What did you expect to happen? The quote should not disappear. The issue seems to be fixed by using two single quotes ('') instead of one in the affected properties, as observed in some en_US files. It might be a known issue, but it makes translation work hard: we are never sure when to use doubled single quotes and when to use a single one (is there a list somewhere?). It is also hard to tell whether a doubled quote in an existing file is a typo or intentional.

Which version of Dataverse are you using? 5.10

Any related open or closed issues to this bug report? No

Screenshots:

  • On the login page: image

bappun · Mar 29 '22 09:03

@bappun I don't have much to add except that I've been baffled about this over the years as well. If the single quote doesn't show up, I add a second one, like you said.

I do think we should better understand and document this behavior in our guides.

pdurbin · Mar 29 '22 14:03

@bappun are you interested in making a pull request to improve the guides?

pdurbin · Oct 01 '22 15:10

Does anyone have an idea where the issue could come from? We have not found anything while working on the translation.

Since it was quite tedious to test each property to see whether it needs one single quote or two, we settled on a workaround: a Unicode character that looks like a single quote. This might be really bad for accessibility, though.

bappun · Oct 19 '22 12:10

FWIW: My guess is that strings read directly from the xhtml files require the single quote to be escaped, whereas ones that we process in Java somewhere may not. That may also be related to whether there's a token in the string to be replaced: if there's no token, the string may be used directly, whereas with a token, when used via a #{bundle['login.signup.blurb']} in xhtml, it runs through a formatter that expects the single quote to be escaped by doubling it. Unfortunately, I'm not sure that gives you an easy way to determine which way to go with a translation. Perhaps using a doubled single quote in any string with a token to be replaced would be a good rule of thumb, but there could still be exceptions to that.
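To illustrate the guess above, here is a minimal sketch of how java.text.MessageFormat treats a lone single quote compared with a doubled one. It assumes, without having confirmed it in the Dataverse code, that login.signup.blurb ends up in MessageFormat; the class name QuoteDemo, the render helper, and the /signup URL are made up for the example.

```java
import java.text.MessageFormat;

// Hypothetical demo class, not part of Dataverse.
class QuoteDemo {

    // Formats a bundle-style pattern with one argument, as a formatter
    // based on MessageFormat would (assumption: this is what the UI does).
    static String render(String pattern, String url) {
        return MessageFormat.format(pattern, url);
    }

    public static void main(String[] args) {
        String single  = "<a href=\"{0}\">S'inscrire pour obtenir un compte Dataverse</a>";
        String doubled = "<a href=\"{0}\">S''inscrire pour obtenir un compte Dataverse</a>";

        // A lone ' opens a quoted section and is dropped from the output.
        System.out.println(render(single, "/signup"));
        // prints: <a href="/signup">Sinscrire pour obtenir un compte Dataverse</a>

        // '' is the escape for one literal single quote.
        System.out.println(render(doubled, "/signup"));
        // prints: <a href="/signup">S'inscrire pour obtenir un compte Dataverse</a>
    }
}
```

The first line of output matches the "Sinscrire" seen on the login page, which is consistent with the formatter explanation but does not prove that every affected key takes this path.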

qqmyers · Oct 19 '22 13:10

FWIW, from the MessageFormat javadocs: "The rules for using quotes within message format patterns unfortunately have shown to be somewhat confusing. In particular, it isn't always obvious to localizers whether single quotes need to be doubled or not. Make sure to inform localizers about the rules, and tell them (for example, by using comments in resource bundle source files) which strings will be processed by MessageFormat. Note that localizers may need to use single quotes in translated strings where the original version doesn't have them."

Short of just indicating line-by-line whether MessageFormat is used, perhaps our code for the other cases could also decode '' and '{...' entries, so that the Bundle could always have encoded (with doubled single quote) entries.
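One way the idea above could look, as a rough sketch only: route every lookup through MessageFormat, including strings that take no arguments, so that translators can always write doubled single quotes. BundleText and its get method are hypothetical names, not existing Dataverse code, and the xhtml lookups would need an equivalent change that is not shown here.

```java
import java.text.MessageFormat;

// Hypothetical helper, not existing Dataverse code.
class BundleText {

    // Always applies MessageFormat rules, even when there are no arguments,
    // so '' decodes to ' and '{' decodes to { for every entry.
    static String get(String pattern, Object... args) {
        return MessageFormat.format(pattern, args);
    }

    public static void main(String[] args) {
        // No token: the doubled quote is still decoded.
        System.out.println(get("S''inscrire"));
        // prints: S'inscrire

        // With a token: same escaping rule applies.
        System.out.println(get("L''utilisateur {0}", "bappun"));
        // prints: L'utilisateur bappun

        // Quoted braces come out as literal braces.
        System.out.println(get("Use '{'0'}' as a placeholder"));
        // prints: Use {0} as a placeholder
    }
}
```

The trade-off is that every existing entry written with a lone single quote would have to be converted once, since it would lose its quote under this scheme.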

qqmyers · Oct 20 '22 17:10