Czech language support, introduced dynamic weight for Unicode blocks

Open zdebra opened this issue 6 years ago • 2 comments

Hello, I've added support for the Czech language, as well as support for dynamic weights for scripts (scriptCountFactor).

The reason is that the Czech alphabet contains a few characters, such as ř, š, and ů, which you won't find in any other language.

I hope you'll like my take on that.

zdebra avatar Jan 06 '19 10:01 zdebra

Thank you for your contribution. However, I think it would fit the pre-existing design better to add a Czech line to the 'profiles.go' file. At least then we wouldn't have to hard-code Unicode values like that.

rylans avatar Feb 13 '19 22:02 rylans

Hard-coding Unicode values is optional; predefined values from the unicode package are supported as well. I don't think your approach with profiles works for all languages. Czech belongs to the Slavic language family, whose members are very similar to each other when you compare the syllables of their words. I wasn't able to make the tool work for Czech while Serbian (another Slavic language) is already present. I'd be happy to see that it's possible, but it feels unlikely.

I came up with a solution based on the fact that characters like ř, š, and ů are unique to the Czech language.
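The general idea can be sketched roughly as follows. This is an illustration only, not the PR's implementation: `czechOnly`, `markerWeight`, and `boostedScore` are hypothetical names, the weight value is arbitrary, and the formula is an assumption rather than how `scriptCountFactor` actually works.

```go
package main

import (
	"fmt"
	"strings"
)

// czechOnly lists the characters cited above as Czech markers.
const czechOnly = "řšů"

// markerWeight is a hypothetical, arbitrary boost factor.
const markerWeight = 0.5

// boostedScore raises a base score in proportion to the share of
// marker characters in text. Text without markers keeps its base score.
func boostedScore(base float64, text string) float64 {
	total, hits := 0, 0
	for _, r := range strings.ToLower(text) {
		total++
		if strings.ContainsRune(czechOnly, r) {
			hits++
		}
	}
	if total == 0 {
		return base
	}
	return base * (1 + markerWeight*float64(hits)/float64(total))
}

func main() {
	fmt.Println(boostedScore(1.0, "řeka"))  // one marker out of four runes
	fmt.Println(boostedScore(1.0, "river")) // no markers, score unchanged
}
```

Scaling by the share of markers rather than their raw count keeps long texts from receiving an unbounded boost; other weighting schemes are equally plausible.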

zdebra avatar Feb 26 '19 10:02 zdebra