nim-regex
nim-regex copied to clipboard
regex very slow at CT
example 1
nim c -d:danger tests/tests.nim 51 seconds
tests/tests 0.04 seconds
notes
I'm not sure whether it's due to VM code executing tests at CT or whether it's the compilation time itself, but the slow compilation time is noticeable.
profiling
nim c --profileVM tests/tests.nim shows
prof: µs #instr location
20773272 4435 /Users/timothee/git_clone/nim/nim-regex/src/regex/compiler.nim(26, 6)
20679694 23062 /Users/timothee/git_clone/nim/nim-regex/src/regex/compiler.nim(10, 6)
18337658 468986 /Users/timothee/git_clone/nim/nim-regex/src/regex.nim(289, 8)
6562522 9757 /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(500, 6)
6457045 12889 /Users/timothee/git_clone/nim/nim-regex/src/regex/litopt.nim(232, 6)
5824885 171468 /Users/timothee/git_clone/nim/nim-regex/src/regex/common.nim(69, 6)
5792159 192220 /Users/timothee/git_clone/nim/nim-regex/src/regex/common.nim(58, 6)
5709299 7983 /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(315, 6)
5682670 326774 /Users/timothee/git_clone/nim/Nim_devel/lib/pure/strutils.nim(2679, 6)
5496339 12001524 /Users/timothee/git_clone/nim/Nim_devel/lib/pure/strutils.nim(2616, 6)
5082979 15966 /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(487, 6)
4869667 736222 /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(50, 6)
4473871 573475 /Users/timothee/git_clone/nim/nim-regex/src/regex/litopt.nim(140, 6)
3463356 876 /Users/timothee/git_clone/nim/nim-regex/src/regex.nim(473, 9)
3461240 15184 /Users/timothee/git_clone/nim/nim-regex/src/regex/nfamacro.nim(562, 6)
3249920 46364 /Users/timothee/git_clone/nim/nim-regex/src/regex/nodematch.nim(110, 6)
2801514 2670768 /Users/timothee/git_clone/nim/nim-unicodedb/src/unicodedb/types.nim(20, 6)
2593027 966009 /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(161, 6)
2459262 238399 /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(65, 6)
2201504 553 /Users/timothee/git_clone/nim/nim-regex/src/regex/nodematch.nim(10, 6)
2062921 1242678 /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(196, 6)
1742430 269189 /Users/timothee/git_clone/nim/nim-regex/src/regex/parser.nim(742, 6)
1612111 188082 /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(294, 6)
1498531 3173322 /Users/timothee/git_clone/nim/nim-unicodedb/src/unicodedb/types.nim(38, 10)
1376718 54650 /Users/timothee/git_clone/nim/Nim_devel/lib/system.nim(647, 6)
1033561 112395 /Users/timothee/git_clone/nim/nim-regex/src/regex/parser.nim(648, 6)
871969 281895 /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(446, 6)
858437 327774 /Users/timothee/git_clone/nim/nim-regex/src/regex/litopt.nim(92, 6)
849408 237897 /Users/timothee/git_clone/nim/nim-regex/src/regex/exptransformation.nim(416, 6)
791580 168000 /Users/timothee/git_clone/nim/nim-regex/src/regex/nfa.nim(185, 6)
but it's to be taken with a grain of salt due to implementation of profileVM
example 2: self contained benchmark
this test shows a 20_000 X slowdown in VM
when defined case2:
#[
nim r -d:case2 -d:danger --benchmarkVM $timn_D/tests/nim/all/t11918.nim
nim r -d:case2 -d:danger --benchmarkVM --hints:off $timn_D/tests/nim/all/t11918.nim
(5.57392, 459)
(0.0003050000000000006, 459)
nim --eval:'echo 5.57392/0.0003050000000000006'
18275.14754098357
]#
import pkg/regex
import std/random
import std/times
proc main()=
# let n = 1000
let n = 2000
let m = 1
var r = initRand(123)
var s = newString(n)
for i in 0..<n:
s[i] = cast[char](r.rand(128))
var pat = r"\w+"
let reg = re(pat)
var c = 0
let t = cpuTime()
for j in 0..<m:
for i in 0..<s.len:
let a = startsWith(s, reg, i)
c += a.ord
let t2 = cpuTime() - t
echo (t2, c)
static: main()
main()
proposal
- JIT-compiling could be an option, enabled by https://github.com/timotheecour/Nim/issues/598; this is analog to how
vmops.hashVmImplspeeds up in VM a key part of hashing in std/hashes, by running native code instead of VM code
question
is there 1 or a few procs that could be run natively, so that the rest of code at CT would execute fast? In other words, is there an equivalent to hashes.hashVmImpl, hashes.hashVmImplByte in nim-regex?
ideally a proc that's:
- not generic (so that it's easy to dlopen)
- has easy to convert params between their value and corresponding PNode (see vmconv)
- is used by everything else so that performance at CT improves overall
IIRC, about 400ms is the compilation itself. The culprit is unicodedb, the unicode data tables take that long to compile. IC should fix that.
nim c -d:danger tests/tests.nim will run the 3k tests at compile time.
is there 1 or a few procs that could be run natively, so that the rest of code at CT would execute fast? In other words, is there an equivalent to hashes.hashVmImpl, hashes.hashVmImplByte in nim-regex?
No, I don't think there is. Params/result are too complex. I'll have this is mind when iterating on nregex, though. nregex improvements tend to be ported to nim-regex.
IC should fix that.
I'm not holding my breath ;-) ; maybe there's something that can be improved regardless of IC?
The culprit is unicodedb, the unicode data tables take that long to compile. IC should fix that.
that seems exactly like the kind of thing where a vmhook would help, assuming it indeed is a bottleneck, eg proc unicodeTypes*(cp: Rune): int = has simple input/output types
I'll have this is mind when iterating on nregex, though. nregex improvements tend to be ported to nim-regex.
TIL about https://github.com/nitely/nregex ; btw, do you have a page (github issue / doc whatever) containing a comparison of nregex vs nim-regex, when to use which one, etc?
that seems exactly like the kind of thing where a vmhook would help, assuming it indeed is a bottleneck, eg proc unicodeTypes*(cp: Rune): int = has simple input/output types
unicodedb has a few const fooTable = [big array], that's what takes 400ms to compile. To reproduce it, you just need to import pkg/unicodedb/properties, and pkg/unicodedb/types, there is no need to call any function. If I understand vmhook only works on functions, no?
TIL about https://github.com/nitely/nregex ; btw, do you have a page (github issue / doc whatever) containing a comparison of nregex vs nim-regex, when to use which one, etc?
nregex is just a playground for my regex experiments, don't use it. I'm working on a TDFA, and I'll likely break all APIs. I think it only makes sense to use it for very simple regex matching. Find/findAll are slow. Captures/submatches, and assertions (\b, $, ^) will use an hybrid DFA/NFA and it's fairly slow. The biggest difference will be once I improve the submatches space complexity, but that's just one part of the puzzle. So, ignore it for now.
that's what takes 400ms to compile.
ok, I was more concerned about the 51 seconds for nim c -d:danger tests/tests.nim, not those 400ms (1 time cost). And IC won't help with those 51 seconds, what's at play is slow VM execution of those tests at CT, for which vmhooks could help.
If I understand vmhook only works on functions, no?
you can also dlopen variables, so values would work too.
If there's no obvious internal proc that's called by all others, we can also wrap user facing API
example: replace
we have:
func replace(s: string; pattern: Regex; by: string; limit = 0): string
the problem is Regex object which is complex
option 1: easy
1 easy option is to add an overload:
func replace(s: string; pattern: string; by: string; limit = 0): string {.vhmhook.}
when called at CT (eg via const a = replace("foo", "fo\w+", ".", 1)), replace would then run with native code, calling pattern(s, re(pattern), by, limit)
this will definitely work and wouldn't be hard, but it adds an overload and won't work transparently for existing code
option 2
we could avoid need for adding an overload with pattern: string; the nim compiler already does something similar with {.magic.} types, which allows defining them specially at CT:
type
UncheckedArray*[T]{.magic: "UncheckedArray".}
type
Ordinal*[T] {.magic: Ordinal.} ## Generic ordinal type. Includes integer,
type
NimNodeObj = object
NimNode* {.magic: "PNimrodNode".} = ref NimNodeObj
I don't have a complete answer for option 2 but it should be possible
Are you concerned about the tests being slow, or about nim-regex being slow at CT? It's not the same, because the current APIs need to be tested at CT, so even if we implement option 1, it won't make the tests fast.
The issue I see is there is no way to compile the pattern at compile time, at least not with option 1.
Are you concerned about the tests being slow, or about nim-regex being slow at CT?
I'm not concerned about the tests's running time, since they don't affect user code; I'm concerned about nim-regex being slow at CT as it can affect user code.
the current APIs need to be tested at CT, so even if we implement option 1, it won't make the tests fast.
i agree with that.
The issue I see is there is no way to compile the pattern at compile time, at least not with option 1.
option 1 can work like this:
proc main =
let a = replace(input, "\w+", "-")
static: main()
replace then gets intercepted using similar registration mechanism as in vmops, which then calls:
# this will be compiled into a dll and dlopened (once) via vmhook mechanism:
proc replaceVmHook(input, pattern, by): string =
result = replace(input, re(pattern), by)
which runs at CT (from the perspective of client code) but using native code.
Furthermore, replaceVmHook can cache re(pattern) so that subsequent invocations with the same pattern can be reused, but I'm not sure this optimiation is needed.
What I mean is that API will compile the regex at runtime every time it's called, when called at runtime.
What I mean is that API will compile the regex at runtime every time it's called, when called at runtime.
I think this can be workaround if we can do something like:
when nimvm:
func replace(s: string; pattern: string; by: string; limit = 0): string {.vhmhook.}
else:
func replace(s: string; pattern: static string; by: string; limit = 0): string
replaceImpl # compile the regex at CT?
or even without the when clause, if overloading works in this case (idk).
something like this could work, overloading on whether pattern is a static param:
func replace(s: string; pattern: string; by: string; limit = 0): string {.vhmhook.}
func replace(s: string; pattern: static string; by: string; limit = 0): string {.vhmhook.}
or simply require user to make this explicit, eg:
func replace(s: string; pattern: string; by: string; limit = 0): string {.vhmhook.}
func replace(s: string; pattern: CtReg; by: string; limit = 0): string {.vhmhook.}
const r: CtReg = re"\w+" # now pkg/regex doesn't need to deal with caching, it's user responsability
let a = replace("ab", r, "cd")
const r: CtReg = re"\w+" # now pkg/regex doesn't need to deal with caching, it's user responsability
Is CtReg a complex type? I thought vhmhook would only support primitive types. If it's not a complex type, then it's not really caching the compiled regex.
This is a valid use case that cannot be supported unless the API takes a compiled regex:
func findInFiles(pattern: string, files: seq[File]): string =
let reg = re(pattern) # compile pattern
for file in files:
if find(file.content, reg):
return file.name
if we only have find(s: string, pattern: string): bool then there is no way of caching the compiled regex.