
FYI: Text Transliteration in stringi

Open gagolews opened this issue 11 years ago • 2 comments

Hi @rich-iannone, just for your information: yesterday I added an interface to ICU's Transliterator in stringi, see this issue....

Some examples:

> stri_trans_general("zażółć gęślą jaźń", "Latin-ASCII") # Polish text
[1] "zazolc gesla jazn"
> stri_trans_general("„groß”©", "Latin-ASCII")
[1] ",,gross\"(C)"
> stri_trans_general("stringi", "Latin-Greek")
[1] "στριγγι"
> stri_trans_general("stringi", "Latin-Cyrillic")
[1] "стринги"

gagolews, Apr 20 '14 09:04

There's also some capability in base R:

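# Read the German sample text, marking the lines as UTF-8
td <- readLines(curl::curl("https://raw.githubusercontent.com/rich-iannone/UnidecodeR/master/inst/examples/Totentanz__de.txt"), encoding = "UTF-8")
# Transliterate to ASCII using the platform's iconv
iconv(td, from = "UTF-8", to = "ASCII//TRANSLIT")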

hadley, Apr 17 '15 20:04

But as far as I can tell, iconv's transliteration produces different output on different platforms, which is a problem in many cases.

Bisaloo, Aug 15 '18 10:08
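
For anyone wanting to check the platform point for themselves, here is a minimal sketch that runs both approaches from this thread on the same input. It assumes the stringi package is installed; the variable names (`x`, `ascii_icu`, `ascii_iconv`) are only illustrative. The ICU result should not depend on the system's iconv, although it may still vary with the ICU version stringi was built against.

```r
library(stringi)  # assumption: stringi is installed

# Polish sample text from the first comment, forced to UTF-8
x <- enc2utf8("zażółć gęślą jaźń")

# ICU-based transliteration via stringi
ascii_icu <- stri_trans_general(x, "Latin-ASCII")

# Base R transliteration; the result depends on the platform's iconv
# implementation and may contain "?" or be NA on some systems
ascii_iconv <- iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")

print(ascii_icu)
print(ascii_iconv)

# Check whether the two approaches agree on this machine
print(identical(ascii_icu, ascii_iconv))
```

Running this on more than one operating system and comparing the printed `ascii_iconv` values is a quick way to see whether the difference matters for your data.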