languagetool
[pt] Rule “por isso existe(m)” → “existindo”
Hello @ricardojosehlima
I believe I already wrote about this rule in the forum weeks or months ago, but here it comes again.
Look at the sentence:
"Os bits são o nível mais baixo existente nos computadores, por isso existem em maior quantidade."
“por isso + Verb_two_forms_possible” → “verb_gerund”.
This would simplify the writing, and I would like to test it to see the results.
Do you have a good rule ID, rule name and suggestion message for such a rule?
Thank you!
Hi! Would this apply to any verb? I thought "Eles chegaram atrasados, por isso estão aqui." --> "Eles chegaram atrasados, estando aqui" is not the same thing.
@ricardojosehlima
Hello dear brother!
❤️ ❤️ ❤️ ❤️ ❤️ ❤️ ❤️
You are back!
I am sure it won't apply to all verbs, that is why I want to test it against 600 000 sentences and remove the verbs it doesn't apply to, or, if easier, add only the “good” verbs after the testing.
Thanks!
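A minimal sketch of how removing a single verb could look, assuming the final verb token is written the way it appears later in this thread. The lemma `estar` is only taken from the counter-example above ("por isso estão aqui"), not from any test results, and whether the tagger assigns it `VMIP3.+` is also an assumption:

```xml
<token postag='VMIP3.+' postag_regexp='yes'>
    <!-- Hypothetical exclusion: skip every inflected form of "estar" -->
    <exception inflected='yes'>estar</exception>
    <exception scope='next' postag_regexp='yes' postag='VMG0000|_PUNCT|VMIP3.+|VMM02.+'/>
    <exception scope='next' regexp='yes'>que|tanto</exception>
</token>
```

More lemmas found during testing could be added to the same exception with `regexp='yes'` and a `|`-separated list.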
Haha, lots of extra work here this week, will be back to normal only next week. Ok for testing it, it is a good idea.
Ricardo!
Next week, when you have the time, please suggest a good rule ID, rule name and suggestion message for this rule.
Thanks!
😄 😄 😄 😄 😄 😄 😄 😄
@ricardojosehlima
Hello!
Are you busy?
Any ideas for a rule ID, rule name and suggestion message?
Thanks!
Hi, sorry for the delay (delays will become more frequent, unfortunately, due to new obligations in my job).
rule_ID = existir_simplificação
rule name = Simplificação de construção com verbo existir
suggestion = Você pode usar a forma no gerúndio nesse caso
@ricardojosehlima
!!!!
I am just finishing the rule, and it works with all verbs.
So, we will need a better rule ID and rule name.
I have created a provisional name for it during the tests.
<!-- POR ISSO EXISTE/EXISTEM existindo -->
<rule id='POR_ISSO_EXISTE_EXISTEM_EXISTINDO' name="Simplicar: Por isso existe/existem → existindo" type='style'>
    <!-- Created by Marco A.G.Pinto with Ricardo Joseh Lima suggestions, Portuguese rule 2022-08-15 (25-JUL-2022+) -->
    <!--
    Os bits são o nível mais baixo existente nos computadores, por isso existem em maior quantidade. → Os bits são o nível mais baixo existente nos computadores, existindo em maior quantidade.
    -->
    <pattern>
        <token>por
            <exception scope='previous' postag_regexp='yes' postag='SENT_START|_QUOT|CS|RM|RN|RG|V.+'/>
        </token>
        <token>isso</token>
        <token postag='VMIP3.+' postag_regexp='yes'>
            <exception scope='next' postag_regexp='yes' postag='VMG0000|_PUNCT|VMIP3.+|VMM02.+'/>
            <exception scope='next' regexp='yes'>que|tanto</exception>
        </token>
    </pattern>
    <message>Esta perífrase poderá ser simplificada.</message>
    <suggestion><match no='3' postag='VMIP3.+' postag_regexp="yes" postag_replace='VMG0000'/></suggestion>
    <example correction="existindo">Os bits são o nível mais baixo existente nos computadores, <marker>por isso existem</marker> em maior quantidade.</example>
</rule>
There is one false positive, which I believe happens because I may have an outdated added.txt in my Wikipedia tool folder:
Esse político, acusado de corrupção, tem muito dinheiro e um bom advogado, por isso nada teme.
(the verb "temer")
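If the false positive turns out to be real rather than an artifact of local data, one way to pin it down is a sketch like the following: adding the sentence inside the rule as an example with no `correction` attribute and no `<marker>`, which LanguageTool's rule tests treat as a sentence the rule must not match. This only records the expectation; it does not fix the cause, which is not established here.

```xml
<!-- Assumed regression check: the rule should stay silent on this sentence -->
<example>Esse político, acusado de corrupção, tem muito dinheiro e um bom advogado, por isso nada teme.</example>
```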
Portuguese (Portugal): 62 total matches
Portuguese (Portugal): ø0.00 rule matches per sentence
Portuguese (Portugal): 17356 input lines ignored (e.g. not between 10 and 300 chars or at least 4 tokens)
Do you believe all is fine?
Thanks!
@ricardojosehlima
I have found a good ID and name for the rule and committed it: https://github.com/languagetool-org/languagetool/commit/8b00baaa420fc7be7609fae108c65ee3895de0e9
❤️ ❤️ ❤️ ❤️ ❤️ ❤️
Hi, sorry for the delay. In the file there are sentences like "O meio ambiente está sempre nos servindo com sua fonte de vida que não é inesgotável e por isso precisa de ajuda!", where "por isso" comes after the conjunction "e". In this and the other similar cases, I think the rule shouldn't apply.
As for the name of the rule, name="Simplicar: Por isso + V. 3.ª Sing/Plural → V. Gerúndio" type='style'>, it must be "Simplificar", right?
Hello @ricardojosehlima
Thanks, I believe we have a good name for it.
I will fix the “e” conjunction at 5 am; right now, I am very stressed… 😄 😄 😄 😄 😄
My life is an adventure… the things I can't say in public… 😄 😄 😄
Thank you very much!
@ricardojosehlima
I can't sleep, so I am back to the projects.
I have implemented the “e” conjunction: https://github.com/languagetool-org/languagetool/commit/2ff47a986671f4528dfe599d112ab77959d48f89
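For readers who don't open the commit, a minimal sketch of one way such an exclusion can be expressed: a second `previous`-scope exception on the first token, so that "por isso" preceded by "e" is skipped. This is an assumption about the approach; the linked commit is authoritative and may do it differently.

```xml
<token>por
    <exception scope='previous' postag_regexp='yes' postag='SENT_START|_QUOT|CS|RM|RN|RG|V.+'/>
    <!-- Assumed addition: do not match when the preceding word is the conjunction "e" -->
    <exception scope='previous'>e</exception>
</token>
```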
Before (with the latest nightly standalone and Wikipedia tools):
Portuguese (Portugal): 61 total matches
Portuguese (Portugal): 582235 total sentences considered
Portuguese (Portugal): ø0.00 rule matches per sentence
After (with “e” removed):
Portuguese (Portugal): 40 total matches
Portuguese (Portugal): 582235 total sentences considered
Portuguese (Portugal): ø0.00 rule matches per sentence
1_after_20220817_removing_E.txt
There is a difference of 21 sentences without the “e”.
Are they all valid?
❤️ ❤️ ❤️ ❤️ ❤️ ❤️ ❤️
Thank you, my dear brother!
If all is valid, tomorrow I will suggest a new rule that seems easy to implement, and later a very powerful rule which, if we make it work, will be a kind of breakthrough in grammar checking (I have had it written down for months, like many others, and I now have the knowledge to code it).
Hi, the rule is now fine! And I can't wait to find out what this rule you've been working on does!
❤️
I will first open a ticket with a simple idea for a rule.
Meanwhile, while revising my thesis, I came up with more ideas for rules… lol 😄 😄 😄 😄 😄 😄 😄
Every little detail gives ideas for rules; this way, the thesis will never be finished. 😄 😄