Option to avoid transliterating punctuation marks as regular letters

Open: eyaler opened this issue 3 years ago • 1 comment

Always transliterating punctuation marks as regular letters could be an issue for some applications. For example, the paragraph sign ¶ is currently transliterated to P; I would like to have an option to treat such marks as unknown instead.

(I opened this issue following https://github.com/avian2/unidecode/commit/81f938d9419f4b651a089a0d809bd1a0566b1329, which changed the exotic inverted nun ׆ to be transliterated as n, the same as the regular nun נ, even though the inverted one is an editorial/punctuation mark.)

eyaler, Jul 22 '21 01:07

Thank you for the suggestion, but I am not going to implement this in Unidecode. I want to keep Unidecode a simple function with no configuration. The reason is similar to why I don't want to have language configuration in this library: I don't have the time or knowledge to maintain the additional complexity. There are other transliteration libraries (unihandecode, for example) that are more configurable and might accept your proposal.

avian2, Aug 02 '21 17:08
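
Since the option was declined, one possible workaround is to filter punctuation in the calling code before transliteration. The following is only a minimal sketch, not part of Unidecode: the helper name `transliterate_keep_punct_unknown` and its `unknown` parameter are hypothetical, and it assumes that "punctuation" means any non-ASCII character whose Unicode general category starts with P (punctuation) or S (symbol). The transliteration function is passed in as an argument, so the helper itself depends only on the standard library.

```python
import unicodedata
from typing import Callable


def transliterate_keep_punct_unknown(
    text: str,
    transliterate: Callable[[str], str],
    unknown: str = "?",
) -> str:
    """Transliterate `text`, treating non-ASCII punctuation/symbols as unknown.

    Hypothetical helper (an assumption, not a Unidecode API):
    - ASCII characters are passed through unchanged;
    - non-ASCII characters in Unicode categories P* or S* become `unknown`;
    - everything else is handed to `transliterate` one character at a time.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Plain ASCII (including ASCII punctuation) needs no transliteration.
            out.append(ch)
        elif unicodedata.category(ch)[0] in ("P", "S"):
            # Non-ASCII punctuation or symbol: treat as unknown.
            out.append(unknown)
        else:
            out.append(transliterate(ch))
    return "".join(out)
```

With Unidecode installed, this could be called as `transliterate_keep_punct_unknown(text, unidecode)` after `from unidecode import unidecode`. Whether the P/S categories are the right cut-off depends on the application; a hand-picked set of code points would work the same way.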