Switch to PCRE2 with JIT (possibly Hyperscan with fallback)

Open rurban opened this issue 8 years ago • 0 comments

PCRE2 is about 40% faster than the core regex engine and more compatible than perl5 in most cases. Its code quality is acceptable, unlike our old, slow 2-pass Spencer regcomp/regexec code with its setjmp/longjmp logic and iteration instead of recursion. It has minor corner-case problems (empty \N names, \B{}; see http://www.pcre.org/current/doc/html/pcre2compat.html), only one major blocking bug (https://github.com/rurban/re-engine-PCRE2/issues/15), and overall arguably fewer regex bugs than older perls, which is relevant when writing portable code. See https://github.com/rurban/re-engine-PCRE2/#failing-tests

For the corner cases we can always fall back to the old core re engine, maybe as a dynamic module. In the current tests it's the other way round: PCRE2, as a dynamic module, falls back to core. We need stats (dtrace probes?).
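The fallback idea can be sketched language-neutrally. The following is only an illustration in Python, not cperl or re::engine::PCRE2 code: `fast_engine`, `core_engine`, `Unsupported` and the `stats` counter are hypothetical names, Python's `re` stands in for both engines, and treating backreferences as the "unsupported" corner case is an arbitrary assumption made for the demo. The counter stands in for whatever stats mechanism (e.g. dtrace probes) ends up being used.

```python
import re
from collections import Counter


class Unsupported(Exception):
    """Raised by an engine that cannot handle a given pattern."""


# Hypothetical counters: how often each path was taken.
stats = Counter()


def fast_engine(pattern, text):
    # Stand-in for the fast primary engine. For this demo only, it
    # refuses any pattern containing a numeric backreference (\1..\9)
    # to simulate a corner case the primary engine cannot handle.
    if re.search(r"\\[1-9]", pattern):
        raise Unsupported(pattern)
    return re.search(pattern, text) is not None


def core_engine(pattern, text):
    # Stand-in for the old core engine, which accepts every pattern.
    return re.search(pattern, text) is not None


def match(pattern, text):
    """Try the fast engine first; fall back to core on Unsupported."""
    try:
        result = fast_engine(pattern, text)
        stats["fast"] += 1
    except Unsupported:
        stats["fallback"] += 1
        result = core_engine(pattern, text)
    return result
```

Calling `match(r"(a)\1", "baab")` takes the fallback path here, and `stats["fallback"]` then shows how often the primary engine had to give up, which is the kind of number needed to judge whether the fallback direction is the right one.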

For boolean matches only (no capture groups, no backtracking, ...) and only on Intel CPUs, also try the much faster Hyperscan engine if the system provides it (it needs C++ and a very recent Boost), e.g. only the latest two Ubuntu releases, not the LTS, and not on Travis. See https://github.com/rurban/re-engine-Hyperscan and https://rust-leipzig.github.io/regex/2017/03/28/comparison-of-regex-engines/ (3x faster than pcre2-jit)
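A rough sketch of that tier selection, again in Python and purely illustrative: `choose_engine` and its parameters are hypothetical, the engine names are just labels, and Python's `re` is used only to count capture groups (its syntax differs from Perl's in places). It checks only the capture-group condition; detecting backtracking-dependent constructs or the CPU type is left out.

```python
import re


def choose_engine(pattern, need_captures, hyperscan_available):
    """Return the label of the engine tier to try first (illustrative)."""
    # Compile with Python's re solely to count capture groups;
    # non-capturing groups such as (?:...) are not counted.
    compiled = re.compile(pattern)
    boolean_only = not need_captures and compiled.groups == 0
    if hyperscan_available and boolean_only:
        return "hyperscan"
    # Everything else goes to PCRE2 first, which may itself
    # fall back to the core engine for its own corner cases.
    return "pcre2-jit"
```

With this check, `choose_engine(r"foo|bar", False, True)` selects the Hyperscan tier, while any pattern with a capture group, or any caller that needs captures, goes to PCRE2.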

rurban • Apr 09 '17 09:04