findutils

consider making onig optional

Open lu-zero opened this issue 2 years ago • 10 comments

There are a number of pure-Rust glob and regex crates; it might be better not to have to deal with a C dependency if we can avoid it.

From what I can see, regex may lack a means to select a specific flavour of regex. I'm not sure whether somebody already has a way to restrict the engine so that it does not support extensions beyond posix/emacs.

lu-zero avatar Feb 25 '23 20:02 lu-zero

Yeah, compatibility is generally why we use onig instead of, say, the standard regex crate. I think we could probably make it optional as the additional features aren't used that much (in coreutils as well). It'd be great if there was some regex api that allowed us to switch with just a feature flag and wouldn't require us to write both versions everywhere.
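
A minimal sketch of what such a switchable API might look like, assuming a Cargo feature named `onig` (the feature name, the trait, and every type below are hypothetical, not existing findutils code). The default backend here is only a substring matcher standing in for a pure-Rust engine, so that the sketch builds with the standard library alone; the `onig` branch assumes the onig crate's `Regex::new` and `Regex::find`.

```rust
/// Hypothetical common interface that call sites would program against.
pub trait PatternMatcher: Sized {
    fn new(pattern: &str) -> Result<Self, String>;
    fn is_match(&self, haystack: &str) -> bool;
}

/// Stand-in for a pure-Rust engine. ASSUMPTION: it only does substring
/// matching so the sketch compiles without external crates.
pub struct PureRustMatcher {
    needle: String,
}

impl PatternMatcher for PureRustMatcher {
    fn new(pattern: &str) -> Result<Self, String> {
        Ok(PureRustMatcher { needle: pattern.to_owned() })
    }

    fn is_match(&self, haystack: &str) -> bool {
        haystack.contains(&self.needle)
    }
}

/// Backend compiled only when the (assumed) `onig` feature is enabled.
#[cfg(feature = "onig")]
pub struct OnigMatcher(onig::Regex);

#[cfg(feature = "onig")]
impl PatternMatcher for OnigMatcher {
    fn new(pattern: &str) -> Result<Self, String> {
        onig::Regex::new(pattern)
            .map(OnigMatcher)
            .map_err(|e| e.to_string())
    }

    fn is_match(&self, haystack: &str) -> bool {
        // `find` searches anywhere in the haystack.
        self.0.find(haystack).is_some()
    }
}

/// The only place where the feature flag is visible to the rest of the code.
#[cfg(feature = "onig")]
pub type DefaultMatcher = OnigMatcher;
#[cfg(not(feature = "onig"))]
pub type DefaultMatcher = PureRustMatcher;
```

Call sites would then write `DefaultMatcher::new(pattern)?.is_match(name)` once, and the feature flag decides which engine sits behind it.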

tertsdiepraam avatar Feb 26 '23 12:02 tertsdiepraam

ripgrep abstracted quite a bit so it can use pcre2 or regex.

lu-zero avatar Feb 26 '23 12:02 lu-zero

https://gitlab.redox-os.org/redox-os/posix-regex might be a good option

tavianator avatar Feb 27 '23 14:02 tavianator

I was convinced that regex supports a superset of the posix one. Probably would be good to make a table of what is supported and what is not. And do the same for the glob crates...

lu-zero avatar Feb 27 '23 15:02 lu-zero

regex is not a superset of either POSIX BREs or EREs, since they both support back-references (e.g. (f)ire\1ox) while regex does not. I agree a comparison table would be nice.

Our glob implementation converts globs to POSIX BREs, so any POSIX-compatible regex implementation gets us globs for free: https://github.com/uutils/findutils/blob/main/src/find/matchers/glob.rs
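
To illustrate the idea (this is a simplified sketch, not the code in the linked glob.rs), a glob can be translated to a BRE character by character. The helper names `glob_to_bre` and `push_literal` are made up for this example. It handles only `*`, `?`, and backslash escapes; bracket expressions such as `[a-z]` are not supported and `[` is treated as a literal, which a real implementation would not do.

```rust
/// Hypothetical helper: translate a simple glob into a POSIX BRE.
/// ASSUMPTION: bracket expressions are not handled; `[` is escaped.
fn glob_to_bre(glob: &str) -> String {
    let mut out = String::with_capacity(glob.len() * 2);
    let mut chars = glob.chars();
    while let Some(c) = chars.next() {
        match c {
            // `*` in a glob matches any run of characters.
            '*' => out.push_str(".*"),
            // `?` in a glob matches exactly one character.
            '?' => out.push('.'),
            // A backslash makes the next glob character literal.
            '\\' => match chars.next() {
                Some(next) => push_literal(&mut out, next),
                None => push_literal(&mut out, '\\'),
            },
            other => push_literal(&mut out, other),
        }
    }
    out
}

/// Append `c` so that the BRE matches it literally, escaping the
/// characters that are special in a BRE.
fn push_literal(out: &mut String, c: char) {
    if matches!(c, '.' | '[' | '\\' | '*' | '^' | '$') {
        out.push('\\');
    }
    out.push(c);
}
```

For example, `glob_to_bre("*.rs")` yields the BRE `.*\.rs`. The result is not anchored, so a caller that needs a whole-name match would have to wrap it in `^` and `$` or use a full-match API.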

tavianator avatar Feb 27 '23 22:02 tavianator

onig is still not updated, and clang-16 is going to hit more distributions. Given that upstream seems unresponsive, should we start looking for alternatives more actively?

lu-zero avatar Mar 26 '23 11:03 lu-zero

Am I missing some part of the conversation? What's going on with clang-16?

tertsdiepraam avatar Mar 26 '23 14:03 tertsdiepraam

clang-16 makes onig non-buildable. I sent a patch to fix it more or less at the same time as I opened this issue.

lu-zero avatar Mar 27 '23 07:03 lu-zero

Oh, that's unfortunate. I wonder if https://crates.io/crates/fancy-regex would be a good alternative?

tertsdiepraam avatar Mar 27 '23 08:03 tertsdiepraam

That explains the error I'm getting when building this using MSYS2 / UCRT64, as that project recently updated to Clang 16.

brisingraerowing avatar Apr 06 '23 02:04 brisingraerowing