Regex results do not match nesting of groups in `ThompsonEngine`

Open Skyb0rg007 opened this issue 11 months ago • 2 comments

Version

110.99.4 (Latest)

Operating System

  • [X] Any
  • [ ] Linux
  • [ ] macOS
  • [ ] Windows
  • [ ] Other Unix

OS Version

No response

Processor

  • [X] Any
  • [ ] Arm (using Rosetta)
  • [ ] PowerPC
  • [ ] Sparc
  • [ ] x86 (32-bit)
  • [ ] x86-64 (64-bit)
  • [ ] Other

System Component

SML/NJ Library

Severity

Minor

Description

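The documentation for the `MatchTree` structure states:

"The tree structure corresponds to the nesting of groups in the regular expression."

However, `ThompsonEngine` does not seem to implement this: the match tree it returns has no children, even when the regular expression contains nested groups.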

Transcript

- val re = Option.valOf (StringCvt.scanString AwkSyntax.scan "(a(b)a)c")
val re = Concat [Group (Alt [Concat [Char #"a", Group (Alt [#]), Char #"a"]]), Char#"c"] : syntax
- StringCvt.scanString (ThompsonEngine.find (ThompsonEngine.compile re)) "abac";
val it = SOME (Match ({len=4,pos=0}, [])) : StringCvt.cs match option

Expected Behavior

- val re = Option.valOf (StringCvt.scanString AwkSyntax.scan "(a(b)a)c")
val re = Concat [Group (Alt [Concat [Char #"a", Group (Alt [#]), Char #"a"]]), Char#"c"] : syntax
- StringCvt.scanString (ThompsonEngine.find (ThompsonEngine.compile re)) "abac";
val it = SOME (Match ({len=4,pos=0}, [Match ({len=3,pos=0}, [Match ({len=1,pos=1}, [])])])) : StringCvt.cs match option
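
For reference, here is a minimal sketch of how the expected tree maps back onto the groups of "(a(b)a)c". It makes two assumptions: it declares a local `match_tree` datatype mirroring the shape printed above instead of opening the library's `MatchTree` structure, and it treats `pos` as a plain integer offset into the input. `preorder`, `expected`, `input`, and `groups` are illustrative names, not library functions.

```sml
(* Local mirror of the match-tree shape shown in the transcript, declared
   here (an assumption) so the sketch does not depend on the library. *)
datatype 'a match_tree = Match of 'a * 'a match_tree list

(* List the tree in preorder: the whole match first, then each group in
   the order of its opening parenthesis. *)
fun preorder (Match (x, kids)) = x :: List.concat (List.map preorder kids)

(* The tree from "Expected Behavior" for "(a(b)a)c" matched against "abac". *)
val expected =
  Match ({pos = 0, len = 4},
    [Match ({pos = 0, len = 3},
      [Match ({pos = 1, len = 1}, [])])])

(* Cut each span out of the input to see which text each group matched. *)
val input = "abac"
val groups =
  List.map (fn {pos, len} => String.substring (input, pos, len))
           (preorder expected)
(* groups = ["abac", "aba", "b"] *)
```

With the tree that `ThompsonEngine` currently returns, the same traversal would yield only the whole match, because the root has no children.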

Steps to Reproduce

See the transcript above.

Additional Information

I'm not sure if the regexp module is used, since the BackTrackEngine doesn't seem to be maintained. If this is the intended behavior, I would just like to see the expected behavior documented properly.

Skyb0rg007 • Mar 05 '24 23:03