Pontoon does not correctly differentiate between Turkish dotted and dotless "i"

Open harmitgoswami opened this issue 1 year ago • 1 comments

Currently, Pontoon doesn't differentiate between the Turkish 'ı' and 'i' (capital I and İ respectively), despite these being different characters.

For example, these two queries produce the exact same results (in addition to incorrect highlighting):

https://pontoon.mozilla.org/tr/firefox/browser/browser/browser.ftl/?search=%C4%B1&string=246376
https://pontoon.mozilla.org/tr/firefox/browser/browser/browser.ftl/?search=i&string=246376

This bug has been brought up and addressed before: https://bugzilla.mozilla.org/show_bug.cgi?id=1346180
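To make the character distinction concrete, here is a minimal Python sketch. It is not taken from Pontoon's code and makes no claim about where the search actually goes wrong; it only shows that a locale-independent case-insensitive comparison treats 'ı' and 'i' as equal, while a Turkish-aware one does not. The helper names `naive_equal`, `tr_upper`, `tr_lower`, and `tr_equal` are hypothetical.

```python
# The four letters involved, with their Unicode code points.
DOTLESS_LOWER = "\u0131"  # ı
DOTTED_LOWER = "\u0069"   # i
DOTLESS_UPPER = "\u0049"  # I
DOTTED_UPPER = "\u0130"   # İ


def naive_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison using Python's locale-independent upper()."""
    return a.upper() == b.upper()


def tr_upper(s: str) -> str:
    """Uppercase using Turkish rules: i -> İ and ı -> I."""
    return s.replace("i", "İ").replace("ı", "I").upper()


def tr_lower(s: str) -> str:
    """Lowercase using Turkish rules: I -> ı and İ -> i."""
    return s.replace("I", "ı").replace("İ", "i").lower()


def tr_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison that keeps dotted and dotless i apart."""
    return tr_lower(a) == tr_lower(b)


# Both lowercase letters uppercase to plain 'I' by default, so they collide.
collides = naive_equal(DOTLESS_LOWER, DOTTED_LOWER)   # True
# With Turkish case mapping they stay distinct.
distinct = not tr_equal(DOTLESS_LOWER, DOTTED_LOWER)  # True
```

With these helpers, `tr_equal("I", "ı")` and `tr_equal("İ", "i")` both hold, which is the pairing described above.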

harmitgoswami, Sep 11 '24 01:09

After some research, database collation seems to be the correct and recommended way to go: http://www.i18nguy.com/unicode/turkish-i18n.html

However, even after reverting to our previous approach, I can confirm that Pontoon still doesn't detect the difference between the 'i' and 'ı' characters.

Collation in Django does seem to be supported, but the way we invoke entities.filter and entities.order_by makes me think we'd need a pretty large refactor to properly use Django's Collate function.
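As a rough illustration of what collation-based matching means, here is a minimal sketch using Python's standard `sqlite3` module. It is not Pontoon's database setup and it does not use Django's Collate function; the table, the sample strings, and the `tr_ci` collation are all made up for the example. It only shows the general idea of letting the database compare strings under Turkish-aware rules.

```python
import sqlite3


def tr_lower(s: str) -> str:
    """Lowercase using Turkish rules: I -> ı and İ -> i."""
    return s.replace("I", "ı").replace("İ", "i").lower()


def tr_ci(a: str, b: str) -> int:
    """Hypothetical Turkish case-insensitive collation: returns -1, 0, or 1."""
    ka, kb = tr_lower(a), tr_lower(b)
    return (ka > kb) - (ka < kb)


conn = sqlite3.connect(":memory:")
# Register the comparison function under the name used in COLLATE clauses.
conn.create_collation("tr_ci", tr_ci)
conn.execute("CREATE TABLE strings (content TEXT)")
conn.executemany(
    "INSERT INTO strings (content) VALUES (?)",
    [("ılık",), ("ilik",)],
)


def search(term: str) -> list[str]:
    """Return rows equal to `term` under the tr_ci collation."""
    rows = conn.execute(
        "SELECT content FROM strings WHERE content = ? COLLATE tr_ci",
        (term,),
    )
    return [row[0] for row in rows]
```

Under this collation, `search("ILIK")` matches only the dotless row and `search("İLİK")` matches only the dotted one. A real fix would rely on a collation provided by the production database rather than a Python callback.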

harmitgoswami, Sep 27 '24 16:09

Hello here,

I've been comparing the two characters, and I got different results when searching for each of them in the same string.

Here is the comparison:

For the 'i' char:

Image

For the 'ı' char:

Image

It seems the issue has been addressed?

RafaelJohn9, Apr 29 '25 10:04

Nice catch, this does indeed seem to be fixed.

mathjazz, May 02 '25 22:05