
[fr] Incorrect grammar analysis for some nouns incorrectly identified as verbs

Open Sharcoux opened this issue 11 months ago • 5 comments

In the sentence "Je veux mieux faire travailler les équipes de développement et de production", "équipes" is identified as a verb although it is a noun.

I know how to work on the file grammar.xml, but this issue should be addressed elsewhere, and I don't know where or how.

Sharcoux, Mar 19 '24 15:03

I noticed a few more examples:


"Il prend café"

| Token | Lemma | Part-of-speech |
| --- | --- | --- |
| Il | il | R pers suj 3 m s |
| prend | prendre | V ind pres 3 s |
| café | café | J e sp / N m s |

There is no way that café could be an adjective here. There is not even a single noun in the sentence...


"Il prend pelle"

| Token | Lemma | Part-of-speech |
| --- | --- | --- |
| Il | il | R pers suj 3 m s |
| prend | prendre | V ind pres 3 s |
| pelle | pelle / peller | N f s / V imp pres 2 s / V ind pres 1 s / V ind pres 3 s / V sub pres 1 s / V sub pres 3 s |

How can a conjugated verb follow a conjugated verb that is not an auxiliary? What could be the subject of "V sub pres 1 s"? That doesn't seem to make any sense.

Sharcoux, Mar 21 '24 13:03

In the first sentence, équipes is disambiguated correctly as a noun:

<S> Je[je/R pers suj 1 s] veux[vouloir/V ind pres 1 s] mieux[mieux/A] faire[faire/V inf] travailler[travailler/V inf] les[le/D e p,les/_GN_FP] équipes[équipe/N f p,équipes/_GN_FP] de[de/P] développement[développement/N m s] et[et/C coor] de[de/P] production[production/N f s,</S>]<P/>

jaumeortola, Mar 21 '24 15:03

Interesting. I guess it changed recently. The other two are still valid concerns, though.

Sharcoux, Mar 21 '24 16:03

The words in a sentence start with all the tags present in the dictionary. Then, in disambiguation.xml, we select some of those tags. But this is a difficult process, and it is even more difficult with sentences that can contain errors. In general, we do the minimal disambiguation necessary to make the grammar rules work well. If you really need to disambiguate those cases for some grammar rules, we can try to improve the disambiguation rules. But if it is not needed, we will not invest time in this.
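To make the two stages above concrete, here is a minimal Python sketch of "start with every dictionary tag, then filter". It is only an illustration and not LanguageTool's code: the toy `DICTIONARY`, the `Token` class, and the rule `drop_finite_verb_after_finite_verb` are all hypothetical names, and the rule simply encodes the heuristic raised earlier in this thread (a finite verb reading is implausible right after a finite verb that is not an auxiliary). The real rules live in disambiguation.xml and use their own syntax.

```python
import re
from dataclasses import dataclass

# Toy dictionary (hypothetical): every word starts with all the tags it can
# carry. The entries are copied from the analyses quoted in this thread.
DICTIONARY = {
    "il": [("il", "R pers suj 3 m s")],
    "prend": [("prendre", "V ind pres 3 s")],
    "pelle": [
        ("pelle", "N f s"),
        ("peller", "V imp pres 2 s"),
        ("peller", "V ind pres 1 s"),
        ("peller", "V ind pres 3 s"),
        ("peller", "V sub pres 1 s"),
        ("peller", "V sub pres 3 s"),
    ],
}

AUXILIARIES = {"avoir", "être"}
# Only the finite moods that appear in this thread are covered here.
FINITE_VERB = re.compile(r"V (ind|sub|imp) ")


@dataclass
class Token:
    text: str
    readings: list  # list of (lemma, postag) pairs


def tag(sentence):
    """Stage 1: attach every dictionary reading to each word."""
    return [Token(w, list(DICTIONARY.get(w.lower(), []))) for w in sentence.split()]


def drop_finite_verb_after_finite_verb(tokens):
    """Stage 2 (assumed rule): filter readings using the left context."""
    for prev, cur in zip(tokens, tokens[1:]):
        # Act only when the previous token is unambiguously a finite,
        # non-auxiliary verb.
        prev_is_finite = bool(prev.readings) and all(
            FINITE_VERB.match(pos) and lemma not in AUXILIARIES
            for lemma, pos in prev.readings
        )
        if not prev_is_finite:
            continue
        kept = [r for r in cur.readings if not FINITE_VERB.match(r[1])]
        if kept:  # never remove every reading of a token
            cur.readings = kept
    return tokens


tokens = drop_finite_verb_after_finite_verb(tag("Il prend pelle"))
for t in tokens:
    print(t.text, t.readings)
```

The `if kept` guard reflects the caution expressed above: because the input may contain errors, a rule that would leave a token with no reading at all is skipped rather than applied.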

jaumeortola, Mar 21 '24 16:03

Oh, ok, perfect then!

I noticed multiple improvements that are needed in that part, and I see that the rules are quite easy to write. That should make some grammar rules much more accurate. I understand the concern that a sentence can contain mistakes and that disambiguation could therefore lead to misinterpretation. However, I'm quite confident that I can get good results. I'm going to try to open a few MRs, if that's OK with you. I will probably need some trial and error to get something consistent, though.

Sharcoux, Mar 21 '24 23:03