Formalize VCL Private data access via VRT_CTX / VCL line info in panics
As documented in this comment, I was asked to make VCL source line information (as tracked by VPI_count()) available in panics. This would actually be very handy for analysis, and if there was a place for the vpi count in VRT_CTX, we could make VCL_Trace a VCC flag (turning tracing on only if explicitly asked for at compile time), turn VPI_count() into a very efficient inline function and spare a lot of function calls while VCL code is executing.
And so I went down a rabbit role:
- Panics use thread local storage as provided by
pthread_setspecific()to get hold of the panicking thread's context (req, bo etc) - We want use tls sparsely due to the cost involved
- We cannot just track additional information in
VRT_CTX, because it is (for very good reasons)const.
So I looked at the one case where VCL (actually VGC) modifies the CTX - indirectly, which is *ctx->handling. Outside the CLI, it lives in the worker. A simple approach would have been to just add unsigned vpi_count to struct worker and unsigned *vpi_count to struct vrt_ctx. But that would have eaten up half the space I just found wasted in adfac94537004b1b273316e911ca61dfa8960fae and also it appeared to me that it might be about time to give VGC it's own writable bit of VRT_CTX, to not only accommodate the to-be-moved vpi_count, but also to solve the nuisance that evil vmods could fiddle directly with *ctx->handling (this was discussed in the course of #3163).
This PR is an RFC (really!) to discuss
- if we want to go this route
- and, if yes, how
You will find at least one spot which I am not happy about, and we can discuss implementation questions later. At this point, my question really is the above.
I forgot: I recommend to only look at the first commit https://github.com/varnishcache/varnish-cache/pull/3515/commits/f2747392243c75ed668a257642c870a2770f9435
The other changes have been scripted using coccinelle.
FTR: today's bugwash seemed to be generally in favor, Dridi wants to have a closer look still.
With VRT_handled() in, no more reserves.
bugwash: implement vcc_trace differently and move to new PR:
- use debug flag
- add vcl control over it
force-push: Update to remove vcl tracing change
I have polished the initial commit of this PR so I am now happy with it and thus removed the Draft status.
I have also added VCL Source information to panics. Example from 82d502612b67b1040ca8940425754b50462e6fdc :
vpi = 0x7f135e6603b0 {
handling (VCL::return) = 0x0 (none),
ref = 6,
vpi_ref = 0x7f1359ffd920 {
source = 0 (\"<vcl.inline>\"),
offset = 365,
line = 20,
pos = 25,
token = vtc
},
},
I have not yet opened a PR for the reworked vcl tracing because it builds on this PR. It is available here: https://github.com/nigoroll/varnish-cache/commit/75a30d6391da16edf65dcbce8034f82d57800ccc
IRC notes
- in panic, dump the VCL code line instead of "token", if possible
- @bsdphk questions if the VCL line info for panic can be achieved more simply and
- if avoiding the
VPI_count()has any measurable benefit.
FYI we have a new private facility for workers: 525fe180846877dd3d36e7ca062755a756000c29...9ed99f61ebecb8471e87f3d6bdfde34b9937929d
I was asked to provide some numbers on the relevance of the performance improvement enabled by this PR. This PR itself does not aim at performance improvements, but it is the basis for #3836 .
Obviously, the expected gains are higher the more VCL code gets executed.
max potential
Here are results for a testcase designed to emphasize the impact of the change, which I would consider to come close to the maximum potential. That said, real world impact is likely to be less significant. The tables show quantiles of process times in seconds for the tests laid out in detail below:
master a9ac9f0103b7bc2ab4a7059c9cda8495faf5b3bf
| 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
|---|---|---|---|---|---|---|---|---|
| 0.000477 | 0.000488 | 0.000496 | 0.000499 | 0.000503 | 0.000506 | 0.000512 | 0.000521 | 0.000545 |
https://github.com/nigoroll/varnish-cache/commit/54dc0992f8365f197a1314e86ed0be5c3ae38709
| 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
|---|---|---|---|---|---|---|---|---|
| 0.000270 | 0.000278 | 0.000284 | 0.000287 | 0.000290 | 0.000294 | 0.000298 | 0.000303 | 0.000313 |
min potential
These are results for a very basic case which does not favor the advantage of the change.
master a9ac9f0103b7bc2ab4a7059c9cda8495faf5b3bf
| 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
|---|---|---|---|---|---|---|---|---|
| 2.3e-05 | 2.5e-05 | 2.5e-05 | 2.5e-05 | 2.6e-05 | 2.7e-05 | 2.7e-05 | 2.8e-05 | 3.0e-05 |
https://github.com/nigoroll/varnish-cache/commit/54dc0992f8365f197a1314e86ed0be5c3ae38709
| 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
|---|---|---|---|---|---|---|---|---|
| 2.2e-05 | 2.4e-05 | 2.4e-05 | 2.5e-05 | 2.5e-05 | 2.6e-05 | 2.6e-05 | 2.7e-05 | 2.9e-05 |
Details
methodology
The script below was run with the test case vcls given below on the code bases to compare. The script is to be run from a varnish-cache source tree and assumes varnish-cache to be installed under VHOME set in the script
All VSL tags are masked except for Timestamp, Begin, End and Link
#!/bin/bash
typeset -r VHOME=/tmp
name="$1"
if [[ -z "${name}" ]] ; then
echo >&2 "need name argument"
exit 9
fi
set -eu
tags=($(awk < ./doc/sphinx/include/vsl-tags.rst '/^[A-Z]/ { print $1 }'| grep -vE 'Timestamp|Begin|End|Link'))
oIFS="${IFS}"
IFS=,;vsl="${tags[*]/#/-}"
IFS="${oIFS}"
pkill varnishlog || true
pkill varnishd || true
while pgrep varnishd ; do sleep 1 ; done
"${VHOME}"/sbin/varnishd \
-f $PWD/bench_vcl_trace.vcl \
-p vsl_mask="${vsl}" \
-a 127.0.0.1:8080
# prime
echo priming
ab -n 10000 -c 1000 http://localhost:8080/ >/dev/null 2>&1
"${VHOME}"/bin/varnishlog -w "${name}".vsl &
vlog=$!
echo running
ab -n 100000 -c 1000 http://localhost:8080/ 2>&1 | tee "${name}".ab.txt
kill ${vlog}
"${VHOME}"/bin/varnishlog -g raw -r "${name}".vsl -I Timestamp:Process|awk '{print $NF;}' >"${name}".times.txt
R -q -e 'data <- as.numeric(readLines("'"${name}".times.txt'"));quantile(data, probs = seq(.1, .9, by = .1))' \
>"${name}".quantiles.txt
max potential test case
vcl:
vcl 4.1;
backend proforma none;
sub vcl_recv {
return (synth(200));
}
sub vcl_synth {
if (true) { set resp.status = 200; }
# repeat ^^^ 999 x
return (deliver);
}
min potential test case
vcl:
vcl 4.1;
backend proforma none;
sub vcl_recv {
return (synth(200));
}
sub vcl_synth {
return (deliver);
}
OKed by bugwash
It would have been nice to avoid merging purposefully a broken commit a8483cf5c864a87f59b2cc26315f37e9edda2d97, this is going to break git-bisect for no good reason.
@Dridi you have a point, sorry.
I took some time to look at git-bisect and found that we can skip broken commits with a special exit status 125, so I figured we could use that in the vtc-bisect.sh script.
It turns out broken commits were mitigated since day 1 so this situation is not as bad as I thought. Also, the probability to land on this specific commit is not 1.
One of those cases I'm not used to, where my past self didn't lay a trap for his future self...