goawk icon indicating copy to clipboard operation
goawk copied to clipboard

[WIP] Cache runes in chars mode

Open triallax opened this issue 5 months ago • 6 comments

WIP implementation of my proposal in https://github.com/benhoyt/goawk/issues/35#issuecomment-3111734804

unsophisticated and needs further testing and benchmarking, but already quite promising: running gron.awk on a big JSON file in chars mode goes from taking 12(!) minutes to 4 seconds on my laptop with this branch in its current form.

triallax avatar Jul 28 '25 01:07 triallax

using caching for the other functions (e.g. index) is a bit more complicated, because they require knowing which bytes in the string map to which rune. for example, to use the cached runes in e.g. index("مرحبا", "ر"), we need to use strings.Index which takes in a string and not a slice of runes, and then need to somehow map from the "byte index" to the rune/character index

an alternative in this case is to convert the needle to runes and search for it in the haystack slice of runes, but i'm not sure if that has subtle edge cases we may care about (e.g. string normalisation? does strings.Index even handle that? do we care or even want this in the first place?)

on the other hand even with just implementing this optimisation for length and substr we're already landing decent improvements, so maybe the status quo is fine

thoughts?

triallax avatar Jul 28 '25 01:07 triallax

@benhoyt gentle bump, no rush but i'd love to know how you think this looks for a prototype

triallax avatar Aug 15 '25 00:08 triallax

Hi @triallax, I'm not opposed to the approach itself. However, I guess the main thing I'm worried about is that it significantly slows down normal (bytes-mode) performance, due I suppose to all the additional allocations from the new([]rune) when creating each new value. I did a test locally (with simple.awk from https://github.com/benhoyt/countwords and kjvbible_x10.txt) and in bytes mode it was about twice as slow, or took twice as long, with this change.

We could try to have a switch in the interpreter which avoids the allocation if in bytes mode. The wiring for that might be annoying, but at least we could test performance. Even then I'd want to make sure the if/else in the hot path didn't slow bytes-mode down much -- GoAWK's been fairly focussed on performance, and I'd hate to regress much.

In any case, it'd be good if you could show some performance numbers, and we could come up with a way to avoid the extra new() / allocation for every string value.

benhoyt avatar Aug 15 '25 20:08 benhoyt

@triallax Any thoughts on my comment above?

benhoyt avatar Oct 26 '25 13:10 benhoyt

@triallax Any thoughts on https://github.com/benhoyt/goawk/pull/252#issuecomment-3192732174 above?

hi Ben, apologies I forgot this PR was a thing :D I agree with mostly everything you've said, I'll get around to posting performance statistics sometime this week

Even then I'd want to make sure the if/else in the hot path didn't slow bytes-mode down much

I suspect branch predictors would make this essentially a non-issue (given that per goawk execution, only one branch of the if else will ever be taken) but we'll see for sure once i do some benchmarks

triallax avatar Oct 26 '25 20:10 triallax

(sorry for the delay, been a lil busy, i promise i'll get around to this very soon)

triallax avatar Nov 06 '25 00:11 triallax