
issue with substring captures in regexes

Open smunix opened this issue 6 years ago • 1 comments

m a = match a with | '(?<p>.*)_(?<m>CDF|SWP)_PS_.*' -> m | _ -> "0"

print(m("foo-_CDF_PS_-bar"))

this prints out "_SP_". This result doesn't look accurate.

smunix, Jul 13 '18 00:07

That's a nice find. It's also pretty mysterious where that "SP" comes from, since it appears nowhere in the source string or the regex. I reproduced this behavior too, and for what it's worth I think the initial match-anything prefix capture is what throws it off.

I had this interaction in 'hi':

> match "foo-_CDF_PS_-bar" with | '(?<p>.*)_(?<m>CDF|SWP)_PS_.*' -> m | _ -> "???"
"_SP_"
> match "foo-_CDF_PS_-bar" with | 'foo-_(?<m>CDF|SWP)_PS_.*' -> m | _ -> "???"
"CDF"
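
For comparison, here is a rough sketch of what a conventional backtracking engine captures for the same two patterns. It uses Python's `re` module rather than hobbes, so it says nothing about the hobbes implementation; it only illustrates the result one would expect for `m`. Python spells named groups `(?P<name>...)`, and the helper name `capture_m` is made up for this example.

```python
import re

# The input string from the 'hi' session above.
TEXT = "foo-_CDF_PS_-bar"

# Same patterns as above, with Python's (?P<name>...) named-group syntax.
PREFIXED = r"(?P<p>.*)_(?P<m>CDF|SWP)_PS_.*"
LITERAL = r"foo-_(?P<m>CDF|SWP)_PS_.*"


def capture_m(pattern: str, text: str) -> str:
    """Return the named capture 'm' on a whole-string match, else "???"."""
    found = re.fullmatch(pattern, text)
    return found.group("m") if found else "???"


if __name__ == "__main__":
    print(capture_m(PREFIXED, TEXT))  # CDF
    print(capture_m(LITERAL, TEXT))   # CDF
```

Both patterns yield "CDF" here, with the prefix group `p` binding to "foo-", which is consistent with the guess that the prefix capture is what shifts the offsets of `m`.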

kthielen, Jul 13 '18 12:07