
allow RegexOptions.NonBacktracking

Open ieviev opened this issue 8 months ago • 9 comments

This lets you use RegexOptions.NonBacktracking in combination with RegexOptions.ECMAScript as the equivalent of the JS linear (l) regex flag.

The l flag is limited, but it has vastly better performance characteristics for operations such as IsMatch.

A small example for comparison:

open System.Text.RegularExpressions
open Fable.Core.JS

// Linear-time engine: intended to map to a JS regex with the l flag.
let linear_regex (s: string) =
    Regex("(a+)*b$", RegexOptions.NonBacktracking ||| RegexOptions.ECMAScript).Match(s)

// Default backtracking engine.
let backtracking_regex (s: string) = Regex("(a+)*b$").Match(s)

// The trailing "a" after "b" makes the backtracking engine retry every split of the leading run of "a"s.
let input = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaba"

// Runs fn once and logs the elapsed wall-clock time in milliseconds.
let measurefn fn =
    let start = Constructors.Date.now ()
    let _ = fn ()
    let stop = Constructors.Date.now ()
    console.log ($"time taken {stop - start} ms")

console.log "linear:"
measurefn (fun () -> linear_regex input)

console.log "backtracking:"
measurefn (fun () -> backtracking_regex input)


The l flag is not supported together with u in JS, so I'm unsure whether RegexOptions.NonBacktracking should be allowed by itself.

ieviev avatar Apr 24 '25 12:04 ieviev

Hello @ieviev,

Thank you for the PR. Could you please add some tests too?

They can be placed in https://github.com/fable-compiler/Fable/blob/main/tests/Js/Main/RegexTests.fs

MangelMaxime avatar Apr 24 '25 18:04 MangelMaxime

I added some tests and found that .NET does not support combining RegexOptions.ECMAScript with RegexOptions.NonBacktracking (ECMAScript can only be combined with IgnoreCase, Multiline, and Compiled), so the alternative is to support RegexOptions.NonBacktracking by itself.

This means RegexOptions.NonBacktracking has different Unicode behavior in .NET and JS.
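
To make the two points above concrete, here is a minimal sketch meant to run on .NET itself (not through Fable), assuming .NET 7 or later, where RegexOptions.NonBacktracking exists. The helper name `optionsAccepted` is made up for this illustration and is not part of the PR. It checks which option combinations the Regex constructor accepts, then compares how `\w` treats a non-ASCII letter with and without RegexOptions.ECMAScript.

```fsharp
open System
open System.Text.RegularExpressions

// Hypothetical helper: true when the Regex constructor accepts the option set.
let optionsAccepted (options: RegexOptions) =
    try
        Regex("a", options) |> ignore
        true
    with
    | :? ArgumentException -> false // also covers ArgumentOutOfRangeException

// NonBacktracking on its own versus combined with ECMAScript.
let aloneOk = optionsAccepted RegexOptions.NonBacktracking
let combinedOk = optionsAccepted (RegexOptions.NonBacktracking ||| RegexOptions.ECMAScript)

// U+00E9 is a non-ASCII letter (e with acute accent).
let nonAscii = "\u00E9"

// \w is Unicode-aware by default but ASCII-only under RegexOptions.ECMAScript.
let wordUnicode = Regex(@"\w", RegexOptions.NonBacktracking).IsMatch(nonAscii)
let wordEcma = Regex(@"\w", RegexOptions.ECMAScript).IsMatch(nonAscii)

printfn "NonBacktracking alone accepted: %b" aloneOk
printfn "NonBacktracking + ECMAScript accepted: %b" combinedOk
printfn "\\w matches U+00E9 with NonBacktracking: %b" wordUnicode
printfn "\\w matches U+00E9 with ECMAScript: %b" wordEcma
```

On .NET 7 or later this should report true, false, true, false in that order. The last two lines are the kind of Unicode difference to expect between .NET's NonBacktracking mode and a JS regex using l without u.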

ieviev avatar Apr 24 '25 19:04 ieviev

This feature is locked behind '--enable-experimental-regexp-engine' in Node, or it needs a new version of V8. I'm not sure what to do about the tests here.

ieviev avatar Apr 25 '25 11:04 ieviev

> This feature is locked behind '--enable-experimental-regexp-engine' in Node, or it needs a new version of V8. I'm not sure what to do about the tests here.

Do you know which version of Node has support for it out of the box?

> This means RegexOptions.NonBacktracking has different Unicode behavior in .NET and JS.

I am not sure of the exact impact, but in general, if the behavior is similar in JS and .NET, we are okay with it. In some cases, we are also okay with small differences and document them, for example in the Fable docs.

For regex, the current docs say (this could be outdated):

> Regex will always behave as if passed RegexOptions.ECMAScript flag (e.g., no negative look-behind or named groups).

MangelMaxime avatar Apr 25 '25 13:04 MangelMaxime

> Do you know which version of Node has support for it out of the box?

According to this issue, --enable-experimental-regexp-engine should be available in Node.js versions 16 and up: https://github.com/nodejs/node/issues/38297

Maybe this could be passed into the mocha test runner?

> Regex will always behave as if passed RegexOptions.ECMAScript flag (e.g., no negative look-behind or named groups).

That's very good. If the regex is always assumed to have ECMAScript in JS, then there should be no problem.

ieviev avatar Apr 25 '25 13:04 ieviev

> > Do you know which version of Node has support for it out of the box?
>
> According to this issue, --enable-experimental-regexp-engine should be available in Node.js versions 16 and up: nodejs/node#38297

So there is no version of Node that supports it out of the box?

> Maybe this could be passed into the mocha test runner?

Because we don't specify a Node version in GitHub CI, I suppose it is using the LTS release, so we should be fine and just have to pass the flag to the mocha test runner as you proposed.

MangelMaxime avatar Apr 25 '25 14:04 MangelMaxime

> So there is no version of Node that supports it out of the box?

Many sources hint that it will be enabled by default at some point in both Node and Chrome (https://zenodo.org/records/10806044), but I have not seen it enabled by default anywhere yet. I think google-chrome --js-flags=--enable-experimental-regexp-engine or node --enable-experimental-regexp-engine is currently required. I'm uncertain when this will become the default.

ieviev avatar Apr 25 '25 14:04 ieviev

Adding the flag is good enough for me.

MangelMaxime avatar Apr 26 '25 14:04 MangelMaxime

/run fantomas

MangelMaxime avatar May 01 '25 19:05 MangelMaxime