hobbes
issue with substring captures in regexes
m a = match a with | '(?<p>.*)_(?<m>CDF|SWP)_PS_.*' -> m | _ -> "0"
print(m("foo-_CDF_PS_-bar"))
This prints "_SP_", which doesn't look right: I would expect the m capture to be "CDF".
That's a nice find, and it's pretty mysterious where that "SP" comes from, since it appears in neither the source string nor the regex. I reproduced this behavior too, and for what it's worth I think the initial match-anything prefix capture is what throws it off.
I had this interaction in 'hi':
> match "foo-_CDF_PS_-bar" with | '(?<p>.*)_(?<m>CDF|SWP)_PS_.*' -> m | _ -> "???"
"_SP_"
> match "foo-_CDF_PS_-bar" with | 'foo-_(?<m>CDF|SWP)_PS_.*' -> m | _ -> "???"
"CDF"
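As a sanity check on what the capture should be, here is a rough sketch that runs the same pattern through Python's re module. This is only a reference point from a different regex engine, not hobbes itself: the helper name captures is made up for illustration, Python spells named groups as (?P<name>...), and I'm assuming the hobbes match is meant to cover the whole string, hence fullmatch.

```python
import re

# Same pattern as in the hobbes session, with Python's (?P<name>...) spelling
# for named groups. Assumption: hobbes matches against the whole string.
PATTERN = re.compile(r"(?P<p>.*)_(?P<m>CDF|SWP)_PS_.*")


def captures(s):
    """Return the named captures for a whole-string match, or None."""
    match = PATTERN.fullmatch(s)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(captures("foo-_CDF_PS_-bar"))
```

Here p comes back as "foo-" and m as "CDF", which agrees with the second hi query above and suggests the "_SP_" result is a bug in how the leading .* capture is handled.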