varnish-cache icon indicating copy to clipboard operation
varnish-cache copied to clipboard

Add (be)req.trace and feature +trace to control generation of VCL_trace logs

Open nigoroll opened this issue 3 years ago • 0 comments

This PR is the implementation of the bugwash agreement in https://github.com/varnishcache/varnish-cache/pull/3836#issuecomment-1230335147 I am opening a new PR to properly document the performance implications of the alternative approaches. I ran the max potential benchmark from #3515 again on master, #3836, this branch, and one with branch predictions enabled. The results are given below as quantiles of Timestamp:Process times for 100000 requests:

My conclusion is that, while the tracing calls (not) compiled into VGC is still the fastest option, the runtime check from this PR is still substantially faster than the original code from master. If we think that a runtime tracing switch is an important feature (which we agreed upon), then this PR is the way to go.

master 886ab87c87bc7990be3c316feea1c1a117c5365d

10% 20% 30% 40% 50% 60% 70% 80% 90%
0.000383 0.000424 0.000442 0.000454 0.000461 0.000469 0.000477 0.000488 0.000515

this PR (be)req.trace with runtime check, no branch prediction hints https://github.com/nigoroll/varnish-cache/commit/0b41933926fc502872aa625086e93a9fd3ee9e23

10% 20% 30% 40% 50% 60% 70% 80% 90%
0.000201 0.000222 0.000233 0.000241 0.000247 0.000252 0.000258 0.000266 0.000280

this PR (be)req.trace with runtime check and branch prediction hints https://github.com/nigoroll/varnish-cache/commit/165db63002c455ef355e07d30ea40c515137bb46

10% 20% 30% 40% 50% 60% 70% 80% 90%
0.000198 0.000219 0.000230 0.000238 0.000244 0.000250 0.000256 0.000264 0.000278

PR #3836 vcc "hardcoded" tracing code https://github.com/nigoroll/varnish-cache/8e6f877f0053e42a8bc8eb0129d7aae21db2aed3

10% 20% 30% 40% 50% 60% 70% 80% 90%
0.000189 0.000209 0.000222 0.000230 0.000236 0.000242 0.000248 0.000256 0.000271

description from commit message

To potentially emit VCL_trace VSL records, we called VPI_count() from VGC for, in the extreme, every line of VCL.

For a normal setup, VPI_count() would call into VSL only to find out that VCL_trace is masked and not have any effect.

Issuing two additional function calls for each line of VCL obviously is something we should avoid unless there is a real benefit.

To avoid the overhead, we move the check for the tracing flag to VGC and decorate it with UNLIKELY(), which has no effect for now but is expected to become a branch prediction hint soon.

This should bring the cost down close to zero.

Being at it, we also add VCL control over tracing through req.trace and bereq.trace. req.trace gets initialized from feature trace, and bereq.trace inherits the value from req.trace. For now, tracing of the housekeeping subs vcl_init and vcl_fini can only be controlled through feature trace.

nigoroll avatar Aug 30 '22 12:08 nigoroll