
Add PCRE2 support

Open wader opened this issue 2 months ago • 1 comment

This is still work in progress.

Work on oniguruma (https://github.com/kkos/oniguruma) has been discontinued so we need to find some other regex implementation.

To build and test:

$ ./configure --with-oniguruma=no --with-pcre2
$ make
$ make check
$ ./jq --run-tests < ./tests/onig.test
$ ./jq --run-tests < ./tests/pcre2.test

Known TODOs:

  • Proper configure options (skip vendor for pcre2?)
    • Support both for a while?
  • libjq usage? Don't build pcre2 in and require libpcre2 instead? (seems to be what Debian does)
  • pcre2 in ci/release workflows
    • Drop some archs and use docker to simplify things?
    • For macos use native builder instead of cross compile?
  • Update Dockerfile
  • Update docs
    • Breaking changes
      • Drop "l" modifier?
        • Could possibly reimplement
      • Empty pattern and multi-byte code points behave differently. I think pcre2 is more correct?
        • onig: jq -n '["🚀" | match(""; "g")] | length' -> 5 (per byte it seems)
        • pcre2: jq -n '["🚀" | match(""; "g")] | length' -> 2 (per code point)
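The 5 versus 2 difference is consistent with the two engines advancing by different units after an empty match. The sketch below does not call either library; it only models that assumption (the function name `empty_match_count` is hypothetical): an empty pattern can match once before each unit and once at the end of the input, so the count is the number of units plus one.

```python
def empty_match_count(text: str, unit: str) -> int:
    """Count positions where an empty pattern could match.

    Assumes the engine advances one `unit` after each empty match,
    so there is one match before every unit plus one at end of input.
    """
    if unit == "byte":
        length = len(text.encode("utf-8"))  # UTF-8 encoded length in bytes
    elif unit == "code_point":
        length = len(text)  # Python str length counts code points
    else:
        raise ValueError(f"unknown unit: {unit}")
    return length + 1


rocket = "🚀"  # U+1F680, encoded as 4 bytes in UTF-8
per_byte = empty_match_count(rocket, "byte")
per_code_point = empty_match_count(rocket, "code_point")
print(per_byte, per_code_point)  # prints "5 2"
```

If that model is right, per-byte stepping yields match offsets inside the encoded code point, which is why the per-code-point result looks like the more correct one.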

Good references:

  • https://github.com/PCRE2Project/pcre2/blob/master/src/pcre2demo.c
  • https://github.com/PCRE2Project/pcre2/blob/eb3bd3cf1418cb1a0eabf984b0b1e80b6bdd9314~1/src/pcre2demo.c (pre pcre2_next_match usage)
  • The pcre2test CLI tool is a good playground for pcre2

Notes: The ".+?\b" test in onig.test seems to behave differently depending on the pcre2 version. I suspect this fix in 10.43 is the cause: https://github.com/PCRE2Project/pcre2/commit/0a552803330e6eb4234c887a48a1584cb83c0666

I noticed a clang -fsanitize=memory use-of-uninitialized-value issue, but it seems to go away with pcre2 master.

Related to https://github.com/jqlang/jq/issues/3313

wader, Nov 05 '25 23:11

Looking at the download stats, maybe it would be OK to skip some archs? Then we could probably also get away with using docker for cross builds.

$ curl -s https://api.github.com/repos/jqlang/jq/releases | jq -c '.[] | select(.tag_name == "jq-1.8.1").assets | map({name,download_count}) | sort_by(-.download_count)[]'
{"name":"jq-linux64","download_count":1557635}
{"name":"jq-linux-amd64","download_count":1175823}
{"name":"jq-windows-amd64.exe","download_count":264340}
{"name":"jq-osx-amd64","download_count":250878}
{"name":"jq-macos-amd64","download_count":218746}
{"name":"sha256sum.txt","download_count":208786}
{"name":"jq-macos-arm64","download_count":150022}
{"name":"jq-1.8.1.tar.gz","download_count":97404}
{"name":"jq-linux-arm64","download_count":88325}
{"name":"jq-win64.exe","download_count":73076}
{"name":"jq-windows-i386.exe","download_count":8321}
{"name":"jq-1.8.1.zip","download_count":3963}
{"name":"jq-linux-i386","download_count":2137}
{"name":"jq-linux-ppc64el","download_count":1778}
{"name":"jq-linux-s390x","download_count":1451}
{"name":"jq-linux-armhf","download_count":294}
{"name":"jq-linux-armel","download_count":194}
{"name":"jq-linux-mips64","download_count":156}
{"name":"jq-linux-mips","download_count":146}
{"name":"jq-linux-mips64el","download_count":140}
{"name":"jq-linux-riscv64","download_count":128}
{"name":"jq-linux-mips64r6el","download_count":110}
{"name":"jq-linux-mipsel","download_count":109}
{"name":"jq-linux-powerpc","download_count":109}
{"name":"jq-linux-mipsr6","download_count":105}
{"name":"jq-linux-mipsr6el","download_count":105}
{"name":"jq-linux-mips64r6","download_count":100}

skip armhf and below possibly?
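To put a number on that cutoff, here is a small sketch that sums the download counts listed above for jq-linux-armhf and everything ranked below it. The counts are copied from the output above; the variable names are only illustrative, and the total includes non-binary assets such as sha256sum.txt and the source archives.

```python
# Download counts for the jq-1.8.1 assets, copied from the listing above.
downloads = {
    "jq-linux64": 1557635, "jq-linux-amd64": 1175823,
    "jq-windows-amd64.exe": 264340, "jq-osx-amd64": 250878,
    "jq-macos-amd64": 218746, "sha256sum.txt": 208786,
    "jq-macos-arm64": 150022, "jq-1.8.1.tar.gz": 97404,
    "jq-linux-arm64": 88325, "jq-win64.exe": 73076,
    "jq-windows-i386.exe": 8321, "jq-1.8.1.zip": 3963,
    "jq-linux-i386": 2137, "jq-linux-ppc64el": 1778,
    "jq-linux-s390x": 1451, "jq-linux-armhf": 294,
    "jq-linux-armel": 194, "jq-linux-mips64": 156,
    "jq-linux-mips": 146, "jq-linux-mips64el": 140,
    "jq-linux-riscv64": 128, "jq-linux-mips64r6el": 110,
    "jq-linux-mipsel": 109, "jq-linux-powerpc": 109,
    "jq-linux-mipsr6": 105, "jq-linux-mipsr6el": 105,
    "jq-linux-mips64r6": 100,
}

# Rank assets by download count, highest first (same order as the listing).
ranked = sorted(downloads.items(), key=lambda item: -item[1])

# Everything from jq-linux-armhf downwards is the proposed cut.
cut_index = [name for name, _ in ranked].index("jq-linux-armhf")
tail = ranked[cut_index:]

tail_total = sum(count for _, count in tail)
total = sum(downloads.values())
share = tail_total / total

print(f"{len(tail)} assets, {tail_total} downloads, {share:.4%} of all asset downloads")
```

With these numbers the 12 assets from armhf down account for 1696 downloads, roughly 0.04% of all asset downloads for this release.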

wader, Nov 07 '25 16:11