libfsm
Deduplication of alts does not preserve capture groups
```
$ libfsm/build/bin/re -p -rpcre -lpcre '(foo)|(foo)'
(foo)
$ libfsm/build/bin/re -p -rpcre -lpcre '(foo)|(bar)'
(bar)|(foo)
```
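In general, deduplicating alts is a good, simple optimisation that speeds up the generation of an FSM. For regexes containing capture groups, though, the optimisation is not valid: each group is numbered separately, so collapsing '(foo)|(foo)' into '(foo)' loses a capture group.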
There's also reordering, as in the second example above, where '(foo)|(bar)' is printed as '(bar)|(foo)'. I suppose we either refrain from reordering, or we report an error for it as an unsupported situation.
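To make the problem concrete, here is a small sketch of how both rewrites change what a caller sees. It uses Python's `re` module as a stand-in for PCRE capture numbering (an assumption; it does not touch libfsm), and `group_info` is a hypothetical helper written only for this illustration.

```python
import re


def group_info(pattern, text):
    """Return (number of capture groups, captured values) for a full match.

    Hypothetical helper for illustration; uses Python's re module, not libfsm.
    """
    compiled = re.compile(pattern)
    match = compiled.fullmatch(text)
    return compiled.groups, (match.groups() if match else None)


# Deduplication: the rewritten pattern has one capture group fewer.
original = group_info(r'(foo)|(foo)', 'foo')
deduped = group_info(r'(foo)', 'foo')

# Reordering: the same input now reports its match under a different group index.
in_order = group_info(r'(foo)|(bar)', 'bar')
reordered = group_info(r'(bar)|(foo)', 'bar')

print(original, deduped)
print(in_order, reordered)
```

Both rewrites accept exactly the same strings as the original pattern, which is why they are safe for a plain FSM. The difference only shows up in the group count and in which group index reports the match.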
This has definitely been addressed in my capture branch.