Incorrect transliteration for Japanese language

Open AlexMisiulia opened this issue 4 years ago • 3 comments

Hi, thank you for your library!

I tried to transliterate this Japanese phrase: 今日の天気は良く見えません。

Expected (according to Google Translate): Kyō no tenki wa yoku miemasen.

Actual: Jin Rino Tian Qiha Liangku Jianemasen.

I don't know Japanese well at all, but maybe you can help me find a workaround or suggest another library that works well with Japanese. Thanks!

AlexMisiulia, Jun 17 '20 10:06

Hi, thanks for using this library. As you mentioned, this module doesn't work well with Japanese. Japanese and Chinese share a lot of characters (Kanji vs. Hanzi), so the module cannot differentiate the two languages, and many Japanese characters end up transliterated as Chinese characters.

Another issue is that a single Japanese Kanji can be transliterated into different Romaji (Roman characters) depending on the sentence it appears in. So without doing a grammatical analysis, there's no way to accurately transliterate Japanese.

You may want to try something like the kuroshiro module instead: https://kuroshiro.org/#demo (select "To: Romaji" and "Mode: Spaced")

yf-hk, Jun 18 '20 01:06

@dzcpy Could this be solved by explicitly passing the locale of the source text? transliterate('今日の天気は良く見えません。', { locale: 'ja' })

milesj, Aug 11 '20 23:08

@milesj Yes, that's one solution. However, it doesn't solve the polyphone issue with Japanese (the same Kanji being read differently depending on context).
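
To illustrate the polyphone issue, here is a minimal, self-contained sketch. The reading tables are toy data made up for this example, and `perCharacter` and `longestMatch` are hypothetical helpers; none of this is the module's actual implementation or kuroshiro's API. It only shows why a character-by-character lookup cannot produce "Kyō" for 今日, while a word-level lookup can.

```javascript
// Toy tables for illustration only; real dictionaries are far larger.
// Each kanji gets ONE fixed reading here, which is exactly the limitation.
const charReadings = new Map([
  ['今', 'ima'],
  ['日', 'hi'],
  ['の', 'no'],
  ['天', 'ten'],
  ['気', 'ki'],
]);

// Word-level readings: the same kanji read differently inside a compound.
const wordReadings = new Map([
  ['今日', 'kyō'],
  ['天気', 'tenki'],
]);

// Naive approach: map every character independently.
function perCharacter(text) {
  return Array.from(text)
    .map((ch) => charReadings.get(ch) ?? ch)
    .join(' ');
}

// Greedy longest-match: try multi-character words first,
// then fall back to the single-character table.
function longestMatch(text, maxLen = 4) {
  const chars = Array.from(text);
  const out = [];
  let i = 0;
  while (i < chars.length) {
    let consumed = 0;
    for (let len = Math.min(maxLen, chars.length - i); len >= 2; len--) {
      const candidate = chars.slice(i, i + len).join('');
      if (wordReadings.has(candidate)) {
        out.push(wordReadings.get(candidate));
        consumed = len;
        break;
      }
    }
    if (consumed === 0) {
      out.push(charReadings.get(chars[i]) ?? chars[i]);
      consumed = 1;
    }
    i += consumed;
  }
  return out.join(' ');
}

console.log(perCharacter('今日の天気')); // ima hi no ten ki
console.log(longestMatch('今日の天気')); // kyō no tenki
```

Even a word table like this is only a partial fix: some words have several readings themselves, so choosing the right one still requires the kind of grammatical analysis mentioned earlier in the thread.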

yf-hk, Aug 11 '20 23:08