languagetool
[fr] Incorrect grammar analysis for some nouns incorrectly identified as verbs
In the sentence "Je veux mieux faire travailler les équipes de développement et de production", "équipes" is identified as a verb while it is a noun.
I know how to work on the file grammar.xml, but this issue should be addressed elsewhere, and I don't know where or how.
I noticed a few more examples:
"Il prend café"
Token | Lemma | Part-of-speech |
---|---|---|
Il | il | R pers suj 3 m s |
prend | prendre | V ind pres 3 s |
café | café | J e sp / N m s |
There is no way that café could be an adjective here. There is not even a single noun in the sentence...
"Il prend pelle"
Token | Lemma | Part-of-speech |
---|---|---|
Il | il | R pers suj 3 m s |
prend | prendre | V ind pres 3 s |
pelle | pelle / peller | N f s / V imp pres 2 s / V ind pres 1 s / V ind pres 3 s / V sub pres 1 s / V sub pres 3 s |
How can a conjugated verb follow a conjugated verb that is not an auxiliary? What could be the subject of "V sub pres 1 s"? That doesn't seem to make any sense.
In the first sentence, équipes is disambiguated correctly as a noun:
<S> Je[je/R pers suj 1 s] veux[vouloir/V ind pres 1 s] mieux[mieux/A] faire[faire/V inf] travailler[travailler/V inf] les[le/D e p,les/_GN_FP] équipes[équipe/N f p,équipes/_GN_FP] de[de/P] développement[développement/N m s] et[et/C coor] de[de/P] production[production/N f s,</S>]<P/>
Interesting. I guess it changed recently. The other two are still valid concerns, though.
The words in a sentence start with all the tags present in the dictionary. Then, in disambiguation.xml, we select some tags. But this is a difficult process, and it is even more difficult with sentences that can contain errors.
In general, we do the minimal disambiguation necessary to make the grammar rules work well.
If you really need to disambiguate those cases for some grammar rules, we can try to improve the disambiguation rules. But if it is not needed, we will not invest time in this.
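To make the two steps described above concrete, here is a minimal Python sketch. It is a toy model only, not LanguageTool's actual implementation: the `DICTIONARY` contents are copied from the analyses quoted in this thread, and the function names and the rule itself (`keep_noun_after_indicative_verb`) are hypothetical, chosen purely for illustration.

```python
# Toy dictionary: every (lemma, POS tag) reading a word form can have.
# Entries mirror the analyses quoted earlier in this thread.
DICTIONARY = {
    "il": [("il", "R pers suj 3 m s")],
    "prend": [("prendre", "V ind pres 3 s")],
    "café": [("café", "J e sp"), ("café", "N m s")],
    "pelle": [
        ("pelle", "N f s"),
        ("peller", "V imp pres 2 s"),
        ("peller", "V ind pres 1 s"),
        ("peller", "V ind pres 3 s"),
        ("peller", "V sub pres 1 s"),
        ("peller", "V sub pres 3 s"),
    ],
}


def tag(sentence):
    """Step 1: give each token ALL readings found in the dictionary."""
    return [(w, list(DICTIONARY.get(w.lower(), []))) for w in sentence.split()]


def keep_noun_after_indicative_verb(tokens):
    """Step 2 (hypothetical rule): if the previous token is unambiguously
    an indicative verb and the current token has a noun reading,
    keep only the noun readings of the current token."""
    out = []
    for word, readings in tokens:
        prev = out[-1][1] if out else []
        prev_is_verb = bool(prev) and all(pos.startswith("V ind") for _, pos in prev)
        nouns = [r for r in readings if r[1].startswith("N ")]
        if prev_is_verb and nouns:
            readings = nouns
        out.append((word, readings))
    return out


if __name__ == "__main__":
    for word, readings in keep_noun_after_indicative_verb(tag("Il prend pelle")):
        print(word, readings)
```

A rule this naive would also fire on sentences that contain mistakes, which is exactly the risk mentioned above: the narrower the selection, the more a wrong guess can hide an error from the grammar rules.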
Oh, ok, perfect then!
I noticed multiple improvements that are needed in that part, and I see that the rules are quite easy to write. That should make some grammar rules way more accurate. I understand the concern that a sentence can contain mistakes and that the disambiguation could therefore lead to misinterpretation. However, I'm quite confident that I can get good results. I'm gonna try and open a few MRs if that's ok with you. I will probably need a few rounds of trial and error to get something consistent, though.