Bad results using a back-reference

Open bhrgunatha opened this issue 10 months ago • 0 comments

V version: V 0.4.9 78effd0, running on a fully updated Arch Linux desktop.

A simple back-reference, e.g. one that matches doubled characters, gives unexpected results.

>>> import pcre
>>> mut re2 := pcre.new_regex(r'(.)\1', 0) or { panic(err) } 
>>> matches := re2.match_str('ada or bb or xyzzy', 0, 0) or { panic(err) }
>>> println(matches)
pcre.MatchData{
    re: &C.pcre{}
    regex: &pcre.Regex{
        re: &C.pcre{}
        extra: &C.pcre_extra{}
        captures: 1
        options: 0
    }
    ovector: [7, 9, 7, 8, 0, 7]
    str: 'ada or bb or xyzzy'
    group_size: 2
    pos: 0
}
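
For reference, here is how I read the `ovector` field in that dump. This is a small Python sketch rather than V, and it rests on two assumptions: that the field follows the classic PCRE layout, where the leading entries are (start, end) offset pairs and the final third of the vector is scratch space, and that `group_size` counts the pairs that were filled in. `decode` is a hypothetical helper, not part of the `pcre` module.

```python
# Assumption: ovector holds (start, end) offset pairs, pair 0 for the whole
# match and pair 1 for capture group 1; the trailing entries are workspace.
subject = "ada or bb or xyzzy"
ovector = [7, 9, 7, 8, 0, 7]
group_size = 2  # assumed to be the number of filled pairs


def decode(subject, ovector, group_size):
    """Return the substrings described by the first `group_size` pairs."""
    # Offsets are byte offsets in PCRE; the subject is pure ASCII here,
    # so plain string slicing gives the same result.
    return [
        subject[ovector[2 * i]:ovector[2 * i + 1]]
        for i in range(group_size)
    ]


groups = decode(subject, ovector, group_size)
print(groups)  # ['bb', 'b']
```

Under that reading the dump describes a single match: `bb` as the whole match and `b` as capture group 1.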

group_size: 2 seems right, but when I try to get the matches they are wrong.

>>> matches.get_all()
['b']
>>> matches.get(0)!
bb
>>> matches.get(1)!
b
>>> matches.get(2)!
V panic: result not set (Index out of bounds)
v hash: 78effd0
/tmp/v_1000/.noprefix.01JJXSD9V8SYD0S8Z4VC5GKB89.vrepl_temp.01JJXT1G9G2TH2XRH33K8Q8T0S.tmp.c:5335: at _v_panic: Backtrace
/tmp/v_1000/.noprefix.01JJXSD9V8SYD0S8Z4VC5GKB89.vrepl_temp.01JJXT1G9G2TH2XRH33K8Q8T0S.tmp.c:5301: by panic_result_not_set
/tmp/v_1000/.noprefix.01JJXSD9V8SYD0S8Z4VC5GKB89.vrepl_temp.01JJXT1G9G2TH2XRH33K8Q8T0S.tmp.c:8466: by main__main
/tmp/v_1000/.noprefix.01JJXSD9V8SYD0S8Z4VC5GKB89.vrepl_temp.01JJXT1G9G2TH2XRH33K8Q8T0S.tmp.c:8504: by main
>>> 

I expect the results to be "bb" and "zz".

Following the test module, I tried iterating over the matches:
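
As a cross-check of that expectation in a different engine, the same pattern can be run through Python's `re` module. It is not PCRE, but it uses the same back-reference syntax for this pattern, so it only illustrates what I expect a "find all matches" call to return. It says nothing about how the V module works.

```python
import re

subject = "ada or bb or xyzzy"
pattern = re.compile(r"(.)\1")  # any character followed by itself

# finditer yields every non-overlapping match, scanning left to right.
found = [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(subject)]
print(found)  # [(7, 9, 'bb'), (15, 17, 'zz')]
```

The offsets agree with the pairs at the start of the two `ovector` values printed in this report (`7, 9` and `15, 17`).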

>>> for m in matches {
...    println(m.group_size)
...    println(m.get(0)!)
...    println(m.get_all())
...    println(m)
...    println('')
... }
2
bb
['b']
pcre.MatchData{
    re: &C.pcre{}
    regex: &pcre.Regex{
        re: &C.pcre{}
        extra: &C.pcre_extra{}
        captures: 1
        options: 0
    }
    ovector: [7, 9, 7, 8, 0, 7]
    str: 'ada or bb or xyzzy'
    group_size: 2
    pos: 0
}

2
bb
['b']
pcre.MatchData{
    re: &C.pcre{}
    regex: &pcre.Regex{
        re: &C.pcre{}
        extra: &C.pcre_extra{}
        captures: 1
        options: 0
    }
    ovector: [7, 9, 7, 8, 0, 7]
    str: 'ada or bb or xyzzy'
    group_size: 2
    pos: 3
}

2
bb
['b']
pcre.MatchData{
    re: &C.pcre{}
    regex: &pcre.Regex{
        re: &C.pcre{}
        extra: &C.pcre_extra{}
        captures: 1
        options: 0
    }
    ovector: [7, 9, 7, 8, 0, 7]
    str: 'ada or bb or xyzzy'
    group_size: 2
    pos: 5
}

2
bb
['b']
pcre.MatchData{
    re: &C.pcre{}
    regex: &pcre.Regex{
        re: &C.pcre{}
        extra: &C.pcre_extra{}
        captures: 1
        options: 0
    }
    ovector: [7, 9, 7, 8, 0, 7]
    str: 'ada or bb or xyzzy'
    group_size: 2
    pos: 7
}

2
zz
['z']
pcre.MatchData{
    re: &C.pcre{}
    regex: &pcre.Regex{
        re: &C.pcre{}
        extra: &C.pcre_extra{}
        captures: 1
        options: 0
    }
    ovector: [15, 17, 15, 16, 0, 15]
    str: 'ada or bb or xyzzy'
    group_size: 2
    pos: 9
}

2
zz
['z']
pcre.MatchData{
    re: &C.pcre{}
    regex: &pcre.Regex{
        re: &C.pcre{}
        extra: &C.pcre_extra{}
        captures: 1
        options: 0
    }
    ovector: [15, 17, 15, 16, 0, 15]
    str: 'ada or bb or xyzzy'
    group_size: 2
    pos: 11
}

2
zz
['z']
pcre.MatchData{
    re: &C.pcre{}
    regex: &pcre.Regex{
        re: &C.pcre{}
        extra: &C.pcre_extra{}
        captures: 1
        options: 0
    }
    ovector: [15, 17, 15, 16, 0, 15]
    str: 'ada or bb or xyzzy'
    group_size: 2
    pos: 13
}

2
zz
['z']
pcre.MatchData{
    re: &C.pcre{}
    regex: &pcre.Regex{
        re: &C.pcre{}
        extra: &C.pcre_extra{}
        captures: 1
        options: 0
    }
    ovector: [15, 17, 15, 16, 0, 15]
    str: 'ada or bb or xyzzy'
    group_size: 2
    pos: 15
}
>>> 

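I don't understand these results. The loop yields eight items where I expected two: "bb" is reported four times (pos 0, 3, 5, 7) and "zz" four times (pos 9, 11, 13, 15).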
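
To make the expectation concrete, this is the iteration behaviour I had in mind, sketched in Python. `iter_matches` is a hypothetical helper written for illustration. It is not the V module's iterator, and I have not checked how that iterator advances `pos`.

```python
import re


def iter_matches(pattern, subject):
    """Yield each non-overlapping match once, resuming after the previous one."""
    pos = 0
    while pos <= len(subject):
        m = pattern.search(subject, pos)
        if m is None:
            break
        yield m
        # Resume at the end of this match; step one further after an empty
        # match so the loop always makes progress.
        pos = m.end() if m.end() > m.start() else m.end() + 1


subject = "ada or bb or xyzzy"
results = [
    (m.start(), m.group(0), m.group(1))
    for m in iter_matches(re.compile(r"(.)\1"), subject)
]
print(results)  # [(7, 'bb', 'b'), (15, 'zz', 'z')]
```

With this scheme each match appears exactly once, because the next search starts at offset 9 and then at offset 17.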

bhrgunatha, Jan 31 '25 12:01