Write the Jumanpp version into comments when analyzing
In Juman, an S-ID header such as # S-ID:5-5 is output with the version appended, e.g. # S-ID:5-5 JUMAN:7.01.
Jumanpp_v2 passes the header through unchanged, without a version: # S-ID:5-5.
As a result, downstream KNP's header breaks like # S-ID:5-5 ���c� KNP:4.18-CF1.1 DATE:2018/01/30 SCORE:-11.52755.
I work around this by manually appending a pseudo-version to every S-ID header in the input files, e.g. # S-ID:5-5 JUMAN++:2.0.0 (an ugly hack).
This is a KNP bug I think...
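For anyone hitting the same problem, the manual workaround above can be scripted. This is only a rough sketch: the function name `tag_sid_header` is made up for illustration, and `JUMAN++:2.0.0` is the pseudo-version from this post, not an official version string.

```python
# Pseudo-version tag taken from the post; it is a placeholder, not an official string.
PSEUDO_VERSION = "JUMAN++:2.0.0"


def tag_sid_header(line: str, version: str = PSEUDO_VERSION) -> str:
    """Append a version tag to an S-ID comment line; return other lines unchanged."""
    newline = "\n" if line.endswith("\n") else ""
    body = line.rstrip("\n")
    # Only touch S-ID headers, and skip lines that already carry the tag.
    if body.startswith("# S-ID:") and version not in body.split():
        body = f"{body} {version}"
    return body + newline
```

Applying it to each line of an input file before passing the file to Jumanpp should give KNP a header in the shape it appears to expect.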
Suggestion: add a flag --version-comment=<off|append|prepend>, defaulting to off.
In the other two modes we append or prepend the current version to the comment (what counts as the current version is a more complex question).
If the comment of an input sentence was # S-ID:5-5, then we would output:
- # S-ID:5-5 for off
- # Jumanpp:2.0.0 S-ID:5-5 for prepend
- # S-ID:5-5 Jumanpp:2.0.0 for append
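
To make the three modes concrete, here is a small Python sketch of the proposed behaviour. It is an illustration only, not Jumanpp code: the function `apply_version_comment` is hypothetical, the version string `Jumanpp:2.0.0` is just the example value from above, and the flag itself is still only a proposal.

```python
def apply_version_comment(comment: str, mode: str = "off",
                          version: str = "Jumanpp:2.0.0") -> str:
    """Return the comment line as it would be emitted under the given mode."""
    if mode == "off":
        # Default: pass the comment through untouched.
        return comment
    if not comment.startswith("#"):
        raise ValueError("expected a comment line starting with '#'")
    # Text after the leading '#', with surrounding whitespace trimmed.
    body = comment[1:].strip()
    if mode == "prepend":
        return f"# {version} {body}"
    if mode == "append":
        return f"# {body} {version}"
    raise ValueError(f"unknown mode: {mode!r}")
```

Keeping off as the default means existing pipelines see no change unless they opt in.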