
Improving expression detection

Open: Heliozoa opened this issue 1 year ago • 1 comment

Hi, I noticed that ichiran currently segments "どこから見ても" as three separate words, "どこ", "から" and "見ても", rather than as the single expression, which has a JMdict entry. The same is true for some other expressions, such as 負の遺産 or 取り留めも無い. Other expressions, like どう見ても, do get detected as such. I don't know how difficult it would be, but it would be great if ichiran could detect these expressions more reliably.

Heliozoa, Aug 02 '23 09:08
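
To make the request above concrete, here is a minimal sketch of one way adjacent words could be merged back into a dictionary expression after segmentation. It is not how ichiran works internally. `EXPRESSIONS`, `MAX_SPAN`, and `merge_expressions` are hypothetical names, and the small set stands in for the expression entries of JMdict.

```python
# Hypothetical stand-in for the expression entries in JMdict.
EXPRESSIONS = {"どこから見ても", "負の遺産", "どう見ても"}
MAX_SPAN = 4  # assumed upper bound on the number of words per expression


def merge_expressions(tokens, expressions=EXPRESSIONS, max_span=MAX_SPAN):
    """Greedily merge runs of adjacent tokens whose concatenation is a known expression."""
    merged = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first so longer expressions win.
        for span in range(min(max_span, len(tokens) - i), 1, -1):
            candidate = "".join(tokens[i:i + span])
            if candidate in expressions:
                merged.append(candidate)
                i += span
                break
        else:
            # No multi-word expression starts here; keep the single token.
            merged.append(tokens[i])
            i += 1
    return merged


print(merge_expressions(["どこ", "から", "見ても"]))
```

A real segmenter would have to weigh the merged reading against the word-by-word one rather than always preferring the longer match, which this sketch does not attempt.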

Related? If I search Ichiran/ichi.moe for 写真を撮る, it finds the literal JMdict entry (similar to どこから見ても above), but for 写真を撮りました it doesn't recognize that this is a conjugation of the longer entry.

fasiha, Feb 01 '24 04:02
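
The conjugation case in the comment above could, in principle, be handled by matching on the dictionary form of the final word. The following is a rough sketch under assumptions, not ichiran's actual behavior: tokens are assumed to arrive as `(surface, lemma)` pairs, the segmentation of 写真を撮りました into three tokens is assumed, and `EXPRESSIONS` and `find_conjugated_expression` are hypothetical names.

```python
# Hypothetical stand-in for the expression entries in JMdict.
EXPRESSIONS = {"写真を撮る"}


def find_conjugated_expression(tokens, expressions=EXPRESSIONS):
    """Find a span whose last word, reduced to its lemma, completes a known expression.

    tokens is a list of (surface, lemma) pairs.
    Returns (start, end, entry) with end exclusive, or None if nothing matches.
    """
    for start in range(len(tokens)):
        # Spans need at least two tokens; try the longest first.
        for end in range(len(tokens), start + 1, -1):
            span = tokens[start:end]
            # Surface forms of all but the last token, plus the lemma of the last one.
            key = "".join(surface for surface, _ in span[:-1]) + span[-1][1]
            if key in expressions:
                return start, end, key
    return None


# Assumed segmentation: the conjugated verb is a single token whose lemma is 撮る.
tokens = [("写真", "写真"), ("を", "を"), ("撮りました", "撮る")]
print(find_conjugated_expression(tokens))
```

This only covers expressions whose last word conjugates; entries that inflect elsewhere would need a different lookup key.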