Bug in trimming?

Open hectoralos opened this issue 4 years ago • 2 comments

I get this:

$ echo "jouer du violoncelle" | apertium -d . fra-oci
@jouer lo violoncèl

To analyse the issue, I ran the disambiguation mode:

$ echo "jouer du violoncelle" | apertium -d . fra-oci-disam
"<jouer du>"
	"jouer# de" vblex inf SELECT:1514
		"le" det def m sg
;	"jouer" vblex inf SELECT:1514
;		"de" pr
;			"le" det def m sg
"<violoncelle>"
	"violoncelle" n m sg
"<.>"
	"." sent

The problem is that jouer# de is not in my bidix, although it exists in apertium-fra. Instead, the bidix has jouer# de la flûte and jouer# de la trompette:

$ grep "r>jouer<" .deps/fra-oci.dix
<e r="LR"><p><l>flaütar<s n="vblex"/></l>         <r>jouer<g><b/>de<b/>la<b/>flûte</g><s n="vblex"/></r></p></e>
<e r="LR"><p><l>flaütejar<s n="vblex"/></l>       <r>jouer<g><b/>de<b/>la<b/>flûte</g><s n="vblex"/></r></p></e>
<e>       <p><l>jogar<s n="vblex"/></l>           <r>jouer<s n="vblex"/></r></p><par n="v-v_tv"/></e>
<e r="LR"><p><l>jugar<s n="vblex"/></l>           <r>jouer<s n="vblex"/></r></p><par n="v-v_tv"/></e>
<e>       <p><l>protagonizar<s n="vblex"/></l>    <r>jouer<g><b/>le<b/>rôle<b/>principal</g><s n="vblex"/></r></p></e>
<e a="hector"><p><l>trompetar<s n="vblex"/></l><r>jouer<g><b/>de<b/>la<b/>trompette</g><s n="vblex"/></r></p></e>
<e a="hector"><p><l>tamborinar<s n="vblex"/></l><r>jouer<g><b/>du<b/>tambour</g><s n="vblex"/></r></p></e>

I suspect that trimming goes wrong and keeps jouer# de in the analyser, which causes the error.
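
A minimal sketch of the bidix check described above, assuming the entries can be parsed as plain XML with Python's `xml.etree.ElementTree`. It renders each right-side entry in the `lemma# invariant part` notation seen in the analysis and tests whether `jouer# de` is among them. The sample entries are copied from the grep output; the helper name `side_lemma` and the `<section>` wrapper are hypothetical, and this is not how the Apertium tools themselves perform trimming.

```python
import xml.etree.ElementTree as ET

# A few entries copied from the grep output above.
SAMPLE = """
<e r="LR"><p><l>flaütar<s n="vblex"/></l><r>jouer<g><b/>de<b/>la<b/>flûte</g><s n="vblex"/></r></p></e>
<e><p><l>jogar<s n="vblex"/></l><r>jouer<s n="vblex"/></r></p><par n="v-v_tv"/></e>
<e a="hector"><p><l>trompetar<s n="vblex"/></l><r>jouer<g><b/>de<b/>la<b/>trompette</g><s n="vblex"/></r></p></e>
<e a="hector"><p><l>tamborinar<s n="vblex"/></l><r>jouer<g><b/>du<b/>tambour</g><s n="vblex"/></r></p></e>
"""


def side_lemma(side):
    """Render an <l>, <r> or <g> element as a lemma string.

    <b/> becomes a space, <g>...</g> becomes '#' plus its content,
    and <s .../> tag symbols are skipped.
    """
    parts = [side.text or ""]
    for child in side:
        if child.tag == "b":
            parts.append(" ")
        elif child.tag == "g":
            parts.append("#" + side_lemma(child))
        # Text that follows the child element still belongs to the lemma.
        parts.append(child.tail or "")
    return "".join(parts)


# Wrap the entries in a hypothetical root element so they parse as one document.
root = ET.fromstring("<section>" + SAMPLE + "</section>")
right_lemmas = {side_lemma(e.find("p/r")) for e in root.iter("e")}

for lemma in sorted(right_lemmas):
    print(lemma)
print("jouer# de in bidix:", "jouer# de" in right_lemmas)
```

This only compares lemmas and ignores tags and paradigms, so it is a rough consistency check rather than a reproduction of the trimming step.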

hectoralos, Feb 15 '22 20:02

Since you're going in the fra->oci direction, wouldn't trimming be about whether jouer was present on the left side of the bidix?

mr-martian, Feb 15 '22 20:02

You are probably right. When I saw @ in the output, I automatically associated it with a trimming problem, since the @ errors disappeared once trimming was introduced. The fact is that I get an unusual error, and a morphological analysis containing a word that is not in my bidix. jouer# de should not appear in the analysis at all, since no word from the monodix should appear unless it is used in the bidix.

hectoralos, Feb 16 '22 03:02