
Speed up parsing using triehash.

Open data-man opened this issue 3 years ago • 2 comments

triehash is great! :) If you don't mind I can work on it.

data-man avatar Jan 10 '22 15:01 data-man

I am not sure if it will improve the symbol lookup that much, but if you want you can try it out.

lerno avatar Jan 11 '22 00:01 lerno

Yes, it's a half-step. My ideas:

  • replace fnv1a with a faster hasher; according to SMHasher I'd choose komihash or pearsonB
  • replace the hash table with rax or libart; rax is well tested on zillions of Redis servers :) (see the sketch after this list)
  • in the long term, use ctl
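
For illustration, a minimal sketch of what the rax option could look like, assuming the raxNew/raxInsert/raxFind/raxNotFound API declared in Redis's rax.h; the Symbol type and the symtab_* names here are placeholders, not c3c's actual internals:

```c
#include <string.h>
#include "rax.h"

typedef struct Symbol Symbol;   /* placeholder for whatever c3c stores per symbol */

/* Insert a symbol under `name`; returns the previous entry if the name was already bound. */
static Symbol *symtab_set(rax *table, const char *name, Symbol *sym)
{
    void *old = NULL;
    raxInsert(table, (unsigned char *)name, strlen(name), sym, &old);
    return (Symbol *)old;
}

/* Look up a symbol by name; returns NULL if it is not in the table. */
static Symbol *symtab_get(rax *table, const char *name)
{
    void *res = raxFind(table, (unsigned char *)name, strlen(name));
    return res == raxNotFound ? NULL : (Symbol *)res;
}
```

A radix tree like this would also give ordered iteration and prefix lookups for free, which a plain hash table does not.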

data-man avatar Jan 12 '22 12:01 data-man

One thing to understand is that profiling shows that when loading the hash map heavily (lots of symbols), what costs time is mainly memory allocation. This is despite using a memory-mapped implementation. When you request memory, it isn't really allocated until it's touched, and that first touch is sufficiently slow that it's better to go through and touch a bunch of pages in a row up front than to wait until they're needed(!)
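
To make the page-touching point concrete, here is a minimal sketch (not c3c's actual allocator) that reserves a big block with mmap and then writes one byte per page up front, so the page faults are paid in one tight loop instead of one at a time during symbol insertion:

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGE_SIZE 4096

/* Reserve `size` bytes and immediately fault every page in.
 * The mmap call itself is cheap; physical pages are only backed on the
 * first write, so touching them here front-loads that cost. */
static void *reserve_and_touch(size_t size)
{
    uint8_t *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    for (size_t offset = 0; offset < size; offset += PAGE_SIZE)
        mem[offset] = 0;   /* first write triggers the page fault now, not later */
    return mem;
}
```

(On Linux, passing MAP_POPULATE to mmap achieves a similar effect without the explicit loop.)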

Anyway, the cost here is mainly due to the usage pattern, so a "faster hash map" doesn't really matter beyond how its allocation pattern works. Essentially, any off-the-shelf hash map will offer worse tuning possibilities than a custom one.

Yes, fnv1a might not be the fastest, but it's not the bottleneck and it's dead simple. The real compile-time problem will remain LLVM, which clocks in at 1-2 orders of magnitude more expensive than anything else. Also, in order to tune things, a much larger corpus of code is needed than even the standard library; I would like to see something on the order of 500,000 kloc to test on, and we don't have that yet.
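
For reference, on the "dead simple" point: 32-bit FNV-1a is just an XOR and a multiply per byte. This is the textbook algorithm with the standard constants, not necessarily byte-for-byte what c3c ships:

```c
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV-1a over a byte buffer. */
static uint32_t fnv1a(const char *data, size_t len)
{
    uint32_t hash = 0x811C9DC5u;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++)
    {
        hash ^= (uint8_t)data[i];
        hash *= 0x01000193u;          /* FNV prime */
    }
    return hash;
}
```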

lerno avatar Jun 13 '23 14:06 lerno

Hashing is not showing up anywhere in the profiles, so changing the hash is not going to help. What we need is a faster LLVM...

lerno avatar Mar 01 '24 13:03 lerno