Regex results do not match nesting of groups in `ThompsonEngine`
Version
110.99.4 (Latest)
Operating System
- [X] Any
- [ ] Linux
- [ ] macOS
- [ ] Windows
- [ ] Other Unix
OS Version
No response
Processor
- [X] Any
- [ ] Arm (using Rosetta)
- [ ] PowerPC
- [ ] Sparc
- [ ] x86 (32-bit)
- [ ] x86-64 (64-bit)
- [ ] Other
System Component
SML/NJ Library
Severity
Minor
Description
From the `MatchTree` structure:
The tree structure corresponds to the nesting of groups in the regular expression.
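However, this does not seem to be implemented: `ThompsonEngine` returns a match tree with no sub-matches for the nested groups (see the transcript).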
Transcript
- val re = Option.valOf (StringCvt.scanString AwkSyntax.scan "(a(b)a)c")
val re = Concat [Group (Alt [Concat [Char #"a", Group (Alt [#]), Char #"a"]]), Char#"c"] : syntax
- StringCvt.scanString (ThompsonEngine.find (ThompsonEngine.compile re)) "abac";
val it = SOME (Match ({len=4,pos=0}, [])) : StringCvt.cs match option
Expected Behavior
- val re = Option.valOf (StringCvt.scanString AwkSyntax.scan "(a(b)a)c")
val re = Concat [Group (Alt [Concat [Char #"a", Group (Alt [#]), Char #"a"]]), Char#"c"] : syntax
- StringCvt.scanString (ThompsonEngine.find (ThompsonEngine.compile re)) "abac";
val it = SOME (Match ({len=4,pos=0}, [Match ({len=3,pos=0}, [Match ({len=1,pos=1}, [])])])) : StringCvt.cs match option
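The difference between the observed and expected results can be checked mechanically by comparing nesting depth. The following is only a minimal sketch: it declares a stand-in datatype that mirrors the `Match` constructor as printed in the transcript, so it runs without loading the library, and the names `depth`, `observed`, and `expected` are hypothetical helpers, not part of the regexp library.

```sml
(* Stand-in for the library's match tree; an assumption that mirrors
   the shape printed in the transcript: Match (info, children). *)
datatype 'a match_tree = Match of 'a * 'a match_tree list

(* Nesting depth: 1 for a match with no sub-matches (hypothetical helper). *)
fun depth (Match (_, kids)) =
      1 + List.foldl Int.max 0 (List.map depth kids)

(* The tree currently returned for "(a(b)a)c" on input "abac". *)
val observed = Match ({len=4, pos=0}, [])

(* The tree expected if sub-matches followed the nesting of groups. *)
val expected =
      Match ({len=4, pos=0},
             [Match ({len=3, pos=0},
                     [Match ({len=1, pos=1}, [])])])

val observedDepth = depth observed   (* 1 *)
val expectedDepth = depth expected   (* 3 *)
```

With two nested groups inside the whole match, a tree that follows the group nesting should have depth 3, whereas the observed result has depth 1.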
Steps to Reproduce
See the transcript above.
Additional Information
I'm not sure whether the regexp module is still in use, since `BackTrackEngine` doesn't seem to be maintained.
If the current behavior is intended, I would just like to see it documented properly.