Improve memory consumption of the log sampler
Hi!
The zapcore/sampler.go allocates ~0.5MB per instance.
That's quite a bit considering that some binaries instantiate quite a few. For example, OpenTelemetry collectors monitoring Kubernetes cluters can instantiate tens of thousands of those.
Here I suggest an update that divides the memory consumption by 7 and improves performance, with very little drawback.
Details
The sampler allocates numberOfLogLevels * 4096 instances of the counter structure (128 bits). There are 7 different log levels. Once an log message is emitted, it's hashed to one of those 7*4096 buckets.
In practice, all 7 log levels are not used uniformly (far from it), and some (e.g. DPanic, Panic, Fatal) re barely used at all.
I thus propose to only instantiate 4096 instances of the counter structure (dividing the memory footprint in 7) and include the log level as an input to the hash function. Since the 7 log levels are use very unevenly, this only marginally impact the risks of collision while improving cache locality.
I'm not very familiar with this, but my understanding is that all this is in virtual memory. So unused log levels like debug or fatal should not matter for RSS.
I would even wonder if we hit all the 4096 buckets per level in most applications (?) - especially if you initialize thousands of samplers per process, presumably each one of them would use very few of the buckets (?).
Do you have evidence of this really impacting OpenTelemetry collectors badly?
Indeed hopefully completely unused log levels hopefully make it to virtual memory. The number of entries of not completely unused log levels is highly unbalanced though (info >> warning >> error): I have measured significant memory savings in large Kubernetes clusters using OpenTelemetry collectors thanks to this PR.
I didn't touch the number of buckets as I thought it's difficult to know which value fits every use case, but this PR mechanically leads to a bit less buckets (since they are now shared for every log level).
Considering this, do you think this change is a net improvement?
I think this is a good idea, though I'd prefer if it was opt-in, maybe as a setting on the SamplingConfig.