
Non-Roman Scripts

Open LemmaEOF opened this issue 6 years ago • 6 comments

Currently, Plus Codes only work with Roman characters A-Z and Arabic numerals 0-9. While this isn't an issue in countries whose languages use Roman script, it can be incredibly alienating to anyone whose primary or only language uses a non-Roman script, and a huge barrier to understanding and ease of use for users who don't know Roman script. What should Plus Codes do for non-Roman scripts? Should they just use character substitution, or a different system altogether?

LemmaEOF avatar Mar 13 '18 20:03 LemmaEOF

Hi @Boundarybreaker. Yeah, we thought a lot about this. We talked to a bunch of people from countries with other scripts, and they pointed out that they already use A-Z0-9 to enter URLs, so they didn't perceive it as an issue.

A-Z0-9 is either the first or second choice of input characters for most people, and we wanted, initially at least, to have a single representation (otherwise you end up navigating to 9GHJ+P8, seeing a sign with 9ГХЙ+П8, and wondering if you got to the right place or not).

At the moment we don't have any evidence that it's a significant barrier, and we've chosen an initial set of launch countries that includes ones with other scripts (RU, IN, SO) so that we can monitor the situation and get feedback.

Open to suggestions though - I'll leave this issue open in case anyone else would like to comment.

drinckes avatar Mar 14 '18 11:03 drinckes

@drinckes Just as an observation, seeing the '9' and '8' characters in the correct place of a potential different-script plus code makes it immediately more trustworthy to me.

I agree that it might be best to use just one character set - but if localized variants are deemed necessary at some point in the future, mapping individual characters to similar-looking ones (like 'M' to 'П', 'N' to 'Й', 'X' to 'Х') might be a good idea.
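To make the substitution idea concrete, here is a minimal sketch of a per-character mapping. It is purely illustrative and not part of the Open Location Code library: the mapping is inferred from the 9GHJ+P8 / 9ГХЙ+П8 example earlier in the thread, and the names `localize` and `delocalize` are hypothetical.

```python
# Hypothetical look-alike/sound-alike mapping, inferred from the
# 9GHJ+P8 -> 9ГХЙ+П8 example in this thread. Not an official mapping.
LATIN_TO_CYRILLIC = {"G": "Г", "H": "Х", "J": "Й", "P": "П"}
CYRILLIC_TO_LATIN = {v: k for k, v in LATIN_TO_CYRILLIC.items()}


def localize(code: str) -> str:
    """Replace mapped Latin letters; digits, '+' and unmapped letters pass through."""
    return code.translate(str.maketrans(LATIN_TO_CYRILLIC))


def delocalize(code: str) -> str:
    """Inverse of localize, so both spellings resolve to one canonical code."""
    return code.translate(str.maketrans(CYRILLIC_TO_LATIN))


print(localize("9GHJ+P8"))
print(delocalize(localize("9GHJ+P8")))
```

A mapping like this only round-trips if it is one-to-one. Note that this example maps 'H' to 'Х', while the suggestion above maps 'X' to 'Х'; the two cannot coexist in a single reversible table.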

bocops avatar Mar 14 '18 11:03 bocops

@bocops ha, so we've already been through a cycle of thinking about Cyrillic (several people on the team are familiar with it).

One problem is that looking similar != sounding similar. Cyrillic Р looks like Roman P but is pronounced "er", which raises the question: do you use it in place of P? Or R? Or neither?

If we can only use symbols that are both visually distinct from, and sound different to, the Roman characters, we may end up without a lot of choices in some scripts.

drinckes avatar Mar 14 '18 13:03 drinckes

Right - another problem is that choosing either similar-looking or similar-sounding characters will likely lead to an unsorted character set. I know that GX00+ is adjacent to HX00+, because 'G' and 'H' are consecutive letters of the Latin alphabet. The same will probably not be the case with their replacement characters, however chosen.
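The ordering concern can be checked with a short sketch. The 20-character Open Location Code alphabet below is the real one; the replacement table is a hypothetical one inferred from the 9ГХЙ+П8 example in this thread. Sorting here is by Unicode code point, which matches Russian alphabetical order for these four letters.

```python
# The Open Location Code digit alphabet, in ascending value order.
OLC_ALPHABET = "23456789CFGHJMPQRVWX"

# Hypothetical replacements, inferred from the 9ГХЙ+П8 example above.
REPLACEMENTS = {"G": "Г", "H": "Х", "J": "Й", "P": "П"}

# The Latin alphabet is already in sorted order, so adjacent symbols
# mean adjacent cells.
latin_is_sorted = list(OLC_ALPHABET) == sorted(OLC_ALPHABET)

# Replacement letters, listed in the order of the positions they occupy.
in_position_order = [REPLACEMENTS[c] for c in OLC_ALPHABET if c in REPLACEMENTS]

# The same letters in their own alphabetical order.
alphabetical = sorted(in_position_order)

print(latin_is_sorted)
print(in_position_order)
print(alphabetical)
print(in_position_order == alphabetical)
```

With this table the replacement letters occupy positions in the order Г, Х, Й, П, whereas alphabetically they run Г, Й, П, Х, so a reader can no longer infer adjacency from the local alphabet.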

bocops avatar Mar 14 '18 15:03 bocops

I recommend closing this issue as out of scope.

I consider breaking changes of this magnitude to be out of scope.

fulldecent avatar Jun 04 '21 18:06 fulldecent

There hasn't been any meaningful discussion of this issue in over four years. Meanwhile, the FAQ states that

Q: Why is Open Location Code based on latin characters?

A: We are aware that many of the countries where Open Location Codes will be most useful use non-Latin character sets, such as Arabic, Chinese, Cyrillic, Thai, Vietnamese, etc. We selected Latin characters as the most common second-choice character set in these locations. We considered defining alternative Open Location Code alphabets in each character set, but this would result in codes that would be unusable to visitors to that region.

which means that this actually seems resolved as "feature not planned" by now. My suggestion would be to also add the above to the FAQ in the Wiki, then close this issue.

bocops avatar Oct 06 '22 16:10 bocops