Ricardo Joseh Lima

Results 38 comments of Ricardo Joseh Lima

For Portuguese the same happens: "as palavras estão corretos" is not caught. I would like to see some false positives, if possible; maybe I could contribute by finding some pattern...

Hi @jaumeortola (and @jaumeortola and @susanaboatto). I have created a tentative rule for "as palavras estão corretos": <!-- Portuguese rule, 2022-09-02 --> <rule id="LIGAO" name="Ligação"> <pattern> <token postag='SPS00:DA0FP0' negate_pos='yes'>...

Hi @marcoagpinto, I didn't understand the details of the rule, but it is supposed to apply only to the verbs ser and estar: Elas estão bonitos hoje; As palavras são...

Hmmm, looking at 20.txt, I didn't find the rule useful. It only corrected one instance (Eles são altos?); all the others are cases of "a gente", which in pt-br admits...

Hi, take a look at my original proposal, repeated below: <!-- Portuguese rule, 2022-09-02 --> <rule id="LIGAO" name="Ligação"> <pattern> <token postag='SPS00:DA0FP0' negate_pos='yes'></token> <token postag='NCFP000'></token> <token>estão</token> <token postag='AQ0MP0'></token> </pattern> <message>conserte o...

I know it, and I said it in my original comment: it must be adapted to all forms of estar and ser. Can the lemma of these verbs be used...
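A minimal sketch of how the lemma could be used, assuming LanguageTool's `inflected="yes"` token attribute (which matches any inflected form of the given lemma) combined with `regexp="yes"`. The rule id, name and message below are hypothetical placeholders; the POS tags are copied from the original proposal.

```xml
<!-- Sketch only: id, name and message are placeholders, not the final rule -->
<rule id="LIGACAO_SER_ESTAR_SKETCH" name="Ligação (ser/estar)">
  <pattern>
    <!-- first token must not be a preposition+article contraction -->
    <token postag='SPS00:DA0FP0' negate_pos='yes'/>
    <!-- feminine plural noun -->
    <token postag='NCFP000'/>
    <!-- any inflected form of the lemmas ser or estar -->
    <token inflected="yes" regexp="yes">ser|estar</token>
    <!-- masculine plural adjective -->
    <token postag='AQ0MP0'/>
  </pattern>
  <message>Possible agreement error between the noun and the adjective.</message>
</rule>
```

A complete rule would also carry `<example>` sentences, left out here to keep the sketch short.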

That was unexpected. Can you show me the false positives with only this slight change (estão extended to all forms of estar and ser)?

I see... The problem seems to lie here: <token postag='DA0F.+|SPS00:DA0F.+|PP.F.+|SPS00:PP.F.+' postag_regexp="yes"> The rule should exclude na and da, so that "na mala" wouldn't be captured but "a mala" would.
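One possible reading of this suggestion, as a sketch: drop the `SPS00:` alternatives so that only bare articles and pronouns match. This assumes, as the regex in the comment implies, that contractions such as "na" and "da" carry tags starting with `SPS00:` while a plain "a" is tagged `DA0F...`.

```xml
<!-- Sketch: match only bare feminine articles/pronouns, not contractions -->
<token postag='DA0F.+|PP.F.+' postag_regexp="yes"/>
```

Under that assumption "a mala" would still match this token, while "na mala" would not.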

Looks cool! But for pt-br it only suits formal, academic and similar registers.

Solved. I ran the program in the terminal and redirected the output to a file.