Optional token (min=0 max=1) propagates tags from one matched token to another

Open • jimregan opened this issue 5 years ago • 1 comment

Input:

/[null]SENT_START go/[go]Conj:Subord|go/[go]Part:Vb:Cmpl|go/[go]Part:Vb:Subj|go/[go]Prep:Simp  /[null]null n-éirí/[éirigh]Verb:VI:Vow:PresSubj:Ecl|n-éirí/[éirí]Noun:Masc:Com:Sg:Ecl|n-éirí/[éirí]Noun:Masc:Gen:Sg:Ecl|n-éirí/[éirí]Verbal:Noun:VI:Ecl|n-éirí/[éirí]Verbal:Noun:VI:Gen:Ecl

With this rule:

    <rule id="GO_SUBJ" name="go SUBJ">
        <pattern>
            <token postag="SENT_START"></token>
            <token min="0" max="1" regexp="yes">&interp;</token>
            <marker>
                <token postag="Part:Vb:Subj">go</token>
                <token postag=".*Verb.*PresSubj.*" postag_regexp="yes"></token>
            </marker>
        </pattern>
        <disambig action="filterall"/>
    </rule>

I get this output:

/[null]SENT_START go/[go]Part:Vb:Subj  /[null]null n-éirí/[éirigh]Part:Vb:Subj|n-éirí/[éirigh]Part:Vb:Subj|n-éirí/[éirigh]Part:Vb:Subj|n-éirí/[éirigh]Part:Vb:Subj|n-éirí/[éirigh]Part:Vb:Subj

(i.e., the PoS tag from 'go' is propagated to every reading of 'n-éirí', replacing the original tags)

The same thing without the optional token:

    <rule id="GO_SUBJ" name="go SUBJ">
        <pattern>
            <token postag="SENT_START"></token>
            <marker>
                <token postag="Part:Vb:Subj">go</token>
                <token postag=".*Verb.*PresSubj.*" postag_regexp="yes"></token>
            </marker>
        </pattern>
        <disambig action="filterall"/>
    </rule>

works as expected:

/[null]SENT_START go/[go]Part:Vb:Subj  /[null]null n-éirí/[éirigh]Verb:VI:Vow:PresSubj:Ecl
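
Until the optional token is fixed, one possible workaround is to avoid `min="0"` altogether and cover the two cases with separate rules. The sketch below is untested against this bug: the first rule is the working form shown above, the second makes the `&interp;` token mandatory, and the id `GO_SUBJ_INTERP` is made up for illustration. Whether a mandatory leading token avoids the tag propagation has not been verified here.

```xml
<!-- Case 1: "go" directly follows SENT_START (the form reported to work above). -->
<rule id="GO_SUBJ" name="go SUBJ">
    <pattern>
        <token postag="SENT_START"></token>
        <marker>
            <token postag="Part:Vb:Subj">go</token>
            <token postag=".*Verb.*PresSubj.*" postag_regexp="yes"></token>
        </marker>
    </pattern>
    <disambig action="filterall"/>
</rule>

<!-- Case 2: exactly one &interp; token before "go".
     Hypothetical rule id; assumes the bug is specific to min="0". -->
<rule id="GO_SUBJ_INTERP" name="go SUBJ after punctuation">
    <pattern>
        <token postag="SENT_START"></token>
        <token regexp="yes">&interp;</token>
        <marker>
            <token postag="Part:Vb:Subj">go</token>
            <token postag=".*Verb.*PresSubj.*" postag_regexp="yes"></token>
        </marker>
    </pattern>
    <disambig action="filterall"/>
</rule>
```

If the second rule shows the same propagation, that would suggest the problem lies in how the marker positions are counted when any token precedes the marker, not only an optional one.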

jimregan, Nov 03 '19 11:11

Hello, Jim!

So, are you suggesting that we keep the rule without the optional token, since that version works as expected?

I would also like to know whether this issue has been solved, so that we can update it.

srcmilena, Jul 26 '22 11:07