
Wrong transforms for Cyrillic letters

Open serggi opened this issue 6 years ago • 2 comments

Hi, thanks for the bundle. Can you help me understand why Cyrillic letters (I've checked Ukrainian and Russian) are not transformed as defined here: https://github.com/unicode-org/cldr/blob/master/common/transforms/Ukrainian-Latin-BGN.xml

$slugGenerator->generate('щ', ['locale' => 'uk']) gives me s instead of shch. The relevant rules from that file:

########################################################################
#
# BGN Page 94 Rule 3.6
#
# шч becomes sh·ch
#
########################################################################
#
ШЧ → SH·CH ; # CYRILLIC CAPITAL LETTER SHA
Шч → Sh·ch ; # CYRILLIC CAPITAL LETTER SHA
шч → sh·ch ; # CYRILLIC SMALL LETTER SHA
Ш} $lower → Sh ; # CYRILLIC CAPITAL LETTER SHA
Ш → SH ; # CYRILLIC CAPITAL LETTER SHA
ш → sh ; # CYRILLIC SMALL LETTER SHA
Щ} $lower → Shch ; # CYRILLIC CAPITAL LETTER SHCHA
Щ → SHCH ; # CYRILLIC CAPITAL LETTER SHCHA
щ → shch ; # CYRILLIC SMALL LETTER SHCHA

serggi, Sep 19 '19 12:09

Hi @serggi, congrats on your first GitHub issue 🎉🚀

It looks like the SlugGenerator doesn’t find the correct transform for the uk locale. It uses the rule uk-Latn instead of uk-uk_Latn/BGN.

Probably we should add

$locale.'-'.$locale.'_'.$rule,
\Locale::getPrimaryLanguage($locale).'-'.\Locale::getPrimaryLanguage($locale).'_'.$rule,

to https://github.com/ausi/slug-generator/blob/87a661ab766fa5bca78629fc223307a66e61c605/src/SlugGenerator.php#L287-L289

Or it might be better to handle all script rules like Latn correctly by parsing the locale and setting the script via Intl’s Locale class.
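As a rough sketch of the locale-parsing idea, the snippet below builds a transform ID by injecting the target script into the parsed locale. `withScript()` and `transformId()` are hypothetical helpers for illustration and not part of slug-generator; `\Locale::parseLocale()` and `\Locale::composeLocale()` come from PHP's intl extension. Whether ICU actually ships a transform for the resulting ID is a separate question and would still need to be checked.

```php
<?php
// Requires the PHP intl extension.

/**
 * Return $locale with its script subtag set to $script.
 * Hypothetical helper, not part of slug-generator.
 */
function withScript(string $locale, string $script): string
{
    // parseLocale() returns an array of subtags (language, script, region, ...).
    $subtags = \Locale::parseLocale($locale);
    if (!is_array($subtags)) {
        return $locale; // Could not parse: leave the locale untouched.
    }
    $subtags['script'] = $script;

    $composed = \Locale::composeLocale($subtags);

    return is_string($composed) ? $composed : $locale;
}

/**
 * Build a "source-target[/variant]" transform ID for a locale.
 * Hypothetical helper; the ID format follows the one used in this thread.
 */
function transformId(string $locale, string $script, ?string $variant = null): string
{
    $id = $locale . '-' . withScript($locale, $script);

    return $variant === null ? $id : $id . '/' . $variant;
}

echo transformId('uk', 'Latn', 'BGN'), "\n";
```

This keeps region and variant subtags of the original locale intact, which the plain string concatenation above would not handle.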

I’m not sure if we can use the /BGN variant for all languages; we would have to test the impact of adding it.

Until this issue is fixed in the library itself, you can use the preTransforms option in your project like so:

$slugGenerator->generate('щ', ['locale' => 'uk', 'preTransforms' => ['uk-uk_Latn/BGN']]);
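To see which transform IDs your ICU build actually provides, a small probe with PHP's intl `Transliterator` class can help. This is only a sketch: `probeTransforms()` is a hypothetical helper and not part of slug-generator, and the output depends on the installed ICU version, so no result is assumed here.

```php
<?php
// Requires the PHP intl extension.

/**
 * Try each ICU transform ID against $input.
 * Returns [id => transliterated string, or null when the ID is unavailable].
 * Hypothetical helper, not part of slug-generator.
 */
function probeTransforms(array $ids, string $input): array
{
    $results = [];
    foreach ($ids as $id) {
        // Transliterator::create() returns null if ICU does not know the ID.
        $transliterator = \Transliterator::create($id);
        $results[$id] = $transliterator === null
            ? null
            : $transliterator->transliterate($input);
    }

    return $results;
}

// Compare the rule the generator picks today with the BGN variant.
foreach (probeTransforms(['uk-Latn', 'uk-uk_Latn/BGN'], 'щ') as $id => $output) {
    echo $id, ': ', $output ?? '(not available in this ICU build)', "\n";
}
```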

ausi, Sep 19 '19 21:09

Thanks, Martin, for your help and advice. It works well.

serggi, Sep 20 '19 05:09