Improving detection of katakana "compound words"

Open Heliozoa opened this issue 4 months ago • 0 comments

As always, thank you for your work on ichiran!

I noticed that some katakana words that consist of two other katakana words are detected as two individual words rather than as a single "compound word". Some examples:

- コンビニエンスストア is picked up as コンビニエンス followed by ストア
- デジタルカメラ => デジタル + カメラ
- コンピューターゲーム => コンピューター + ゲーム
- サウンドトラック => サウンド + トラック

Other compound words such as ビデオカメラ are detected as one word. Although in most cases the meaning is still clear, it would be very nice if ichiran could detect more of these compound words.
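
To make the request concrete, here is a rough sketch of the kind of behaviour I mean. It is only an illustration in Python and says nothing about how ichiran actually segments text: the `KNOWN_COMPOUNDS` set and the `merge_katakana_compounds` helper are hypothetical names, and a real list of compounds would presumably come from the dictionary data rather than being hard-coded.

```python
import re

# Hypothetical compound list for illustration only; a real implementation
# would look these up in dictionary data instead of hard-coding them.
KNOWN_COMPOUNDS = {
    "コンビニエンスストア",
    "デジタルカメラ",
    "コンピューターゲーム",
    "サウンドトラック",
}

# The Unicode Katakana block, which includes the long-vowel mark ー.
KATAKANA = re.compile(r"[\u30A0-\u30FF]+")


def is_katakana(token):
    """Return True if the token consists only of katakana characters."""
    return KATAKANA.fullmatch(token) is not None


def merge_katakana_compounds(tokens, compounds=KNOWN_COMPOUNDS):
    """Join adjacent katakana tokens when their concatenation is a known compound."""
    merged = []
    for token in tokens:
        if (
            merged
            and is_katakana(merged[-1])
            and is_katakana(token)
            and merged[-1] + token in compounds
        ):
            # The previous token plus this one forms a known compound: merge them.
            merged[-1] += token
        else:
            merged.append(token)
    return merged
```

With this sketch, `merge_katakana_compounds(["デジタル", "カメラ", "を", "買う"])` would yield a token list that starts with デジタルカメラ, while katakana pairs that are not in the compound list stay separate.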

Heliozoa, Sep 27 '24 22:09