Write a Jumanpp version into comments when analyzing

Open kzinmr opened this issue 7 years ago • 2 comments

In Juman, an S-ID header such as # S-ID:5-5 is output with the Juman version appended, like # S-ID:5-5 JUMAN:7.01. Jumanpp_v2 passes the header through unchanged, without a version: # S-ID:5-5. As a result, the header written by KNP downstream breaks, like # S-ID:5-5 ���c� KNP:4.18-CF1.1 DATE:2018/01/30 SCORE:-11.52755.

kzinmr avatar Jan 30 '18 11:01 kzinmr

I work around this issue by manually appending a pseudo-version to every S-ID header in the input files, like # S-ID:5-5 JUMAN++:2.0.0 (an ugly hack).
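
For anyone who wants to script the same workaround, here is a minimal sketch in Python. It only assumes that S-ID headers start with "# S-ID:"; the function name add_pseudo_version and the default tag are hypothetical and are not part of Juman, Jumanpp, or KNP.

```python
def add_pseudo_version(line, tag="JUMAN++:2.0.0"):
    """Append a pseudo-version tag to an S-ID comment line.

    Lines that are not S-ID comments, or that already carry the tag,
    are returned unchanged. The original line ending is preserved.
    """
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    if body.startswith("# S-ID:") and tag not in body:
        return f"{body} {tag}{ending}"
    return line


def patch_lines(lines, tag="JUMAN++:2.0.0"):
    """Apply add_pseudo_version to every line of an input file."""
    return [add_pseudo_version(line, tag) for line in lines]
```

The patched lines can then be written to a new file and fed to Jumanpp in place of the original input.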

kzinmr avatar Jan 30 '18 11:01 kzinmr

This is a KNP bug I think...

Suggestion: add a flag --version-comment=<off|append|prepend>, with off as the default. With the other two values we either append or prepend the current version (what counts as the current version is a more complex question).

If the comment on an input sentence were # S-ID:5-5, we would output:

  • # S-ID:5-5 for off
  • # Jumanpp:2.0.0 S-ID:5-5 for prepend
  • # S-ID:5-5 Jumanpp:2.0.0 for append
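
The three modes above could be sketched roughly as follows. This is Python pseudologic for the proposal only, not Jumanpp code: the function name apply_version_comment is hypothetical, the flag does not exist yet, and the version string is just the one used in the examples above.

```python
def apply_version_comment(comment, mode="off", version="Jumanpp:2.0.0"):
    """Rewrite a '# ...' comment line according to the proposed flag value.

    mode is one of "off", "prepend", "append", mirroring the proposed
    --version-comment=<off|append|prepend> values.
    """
    if mode == "off":
        return comment
    if mode not in ("prepend", "append"):
        raise ValueError(f"unknown mode: {mode}")
    # Strip the leading '#' and surrounding spaces to get the comment body.
    body = comment.lstrip("#").strip()
    if mode == "prepend":
        return f"# {version} {body}"
    return f"# {body} {version}"


for mode in ("off", "prepend", "append"):
    print(mode, "->", apply_version_comment("# S-ID:5-5", mode))
```

Keeping off as the default means existing pipelines see no change in output unless they opt in.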

eiennohito avatar Jan 31 '18 03:01 eiennohito