
cg-mwesplit adds extra newline

Open snomos opened this issue 7 months ago • 2 comments

Compare the following two runs (using giellalt/lang-sme as an example):

echo 'Jođiheaddji guovttosges' | hfst-tokenise -g tokeniser-gramcheck-gt-desc.pmhfst 
"<Jođiheaddji guovttosges>"
	"ges" Pcle Foc/ges <W:0.0> "<ges>"
		"jođiheaddji guovttos" N Coll Sem/Group_Hum Sg Loc <W:0.0> "<Jođiheaddji guovttos>"
	"ges" Pcle Foc/ges <W:0.0> "<ges>"
		"jođiheaddji guovttos" N Coll Sem/Group_Hum Sg Nom <W:0.0> "<Jođiheaddji guovttos>"
:\n
echo 'Jođiheaddji guovttosges' | hfst-tokenise -g tokeniser-gramcheck-gt-desc.pmhfst | cg-mwesplit
"<Jođiheaddji guovttos>"
	"jođiheaddji guovttos" N Coll Sem/Group_Hum Sg Loc <W:0.0>
	"jođiheaddji guovttos" N Coll Sem/Group_Hum Sg Nom <W:0.0>
"<ges>"
	"ges" Pcle Foc/ges <W:0.0>
:\n

After cg-mwesplit has been applied, there is an extra newline after the split cohorts that was not there in the input. Do you get the same, @unhammer?
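
To check this without eyeballing the output, one option is to count the empty lines on each side of cg-mwesplit. This is only a rough sketch: `count_blank_lines` is a hypothetical helper name, not part of cg3 or hfst, and it assumes the extra newline shows up as a completely empty line.

```shell
# Hypothetical helper: print the number of completely empty lines on stdin.
count_blank_lines() {
    # grep -c prints the count of matching lines; it exits with status 1
    # when the count is 0, so '|| true' keeps the function from failing.
    grep -c '^$' || true
}

# Intended usage (requires hfst-tokenise, cg-mwesplit and the lang-sme tokeniser):
#   echo 'Jođiheaddji guovttosges' | hfst-tokenise -g tokeniser-gramcheck-gt-desc.pmhfst | count_blank_lines
#   echo 'Jođiheaddji guovttosges' | hfst-tokenise -g tokeniser-gramcheck-gt-desc.pmhfst | cg-mwesplit | count_blank_lines
```

If the second count is higher than the first, the empty line is being added by cg-mwesplit and not by the tokeniser.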

snomos, Nov 17 '23 08:11