aho-corasick icon indicating copy to clipboard operation
aho-corasick copied to clipboard

RFC: remove configuration knobs

Open BurntSushi opened this issue 4 years ago • 6 comments

This RFC is basically a mirror of https://github.com/BurntSushi/regex-automata/issues/7

The idea here is that I would like to make premultiply=true, byte_classes=true be the only option that one can choose for DFAs. (It is currently the default for DFAs.) I think it was a mistake to expose these knobs, as disabling them doesn't really confer benefits (if any). But they are quite costly in terms of code size and clarity.

Are there any strenuous objections to removing these options?

BurntSushi avatar Mar 28 '20 14:03 BurntSushi

I'm not sure how many users of aho-corasick will read this here. Maybe just go ahead and deprecate it? If there are a lot of objections, you can always revert the deprecation.

vks avatar Apr 25 '21 21:04 vks

Yeah I should have updated this. I'm not holding myself up here as a result of the lack of feedback. My plan is to just to go forward with this. Just haven't got a chance to do it. It's not super high priority.

BurntSushi avatar Apr 25 '21 21:04 BurntSushi

Deprecating them sounds smart though. Might be a good forcing function.

BurntSushi avatar Apr 25 '21 21:04 BurntSushi

Sure, just removing it in the next breaking release also works.

I just thought that deprecating would be the most reliable way to get potential objections. FWIW, I cannot find any code on GitHub setting premultiply(false). For bytes_classes(false), I only found one use in tokei.

vks avatar Apr 25 '21 22:04 vks

@vks Yeah, there are some cases in which disabling byte classes can make the resulting search faster, but I've only ever seen slight improvements. (I've also seen regressions due to the increased size in the automaton.)

BurntSushi avatar Apr 25 '21 22:04 BurntSushi

In the next semver incompatible release, I am also planning to remove the ability to construct automatons using a smaller state ID representation. Instead, it will be fixed to u32 with no way to change it. See the corresponding comment on the regex-automata issue for more details: https://github.com/BurntSushi/regex-automata/issues/7#issuecomment-846627662

BurntSushi avatar May 23 '21 21:05 BurntSushi