As documented in this comment, I was asked to make VCL source line information (as tracked by VPI_count()) available in panics. This would actually be very handy for analysis, and if there was a place for the vpi count in VRT_CTX, we could make VCL_Trace a VCC flag (turning tracing on only if explicitly asked for at compile time), turn VPI_count() into a very efficient inline function and spare a lot of function calls while VCL code is executing.

And so I went down a rabbit role:

Panics use thread local storage as provided by pthread_setspecific() to get hold of the panicking thread's context (req, bo etc)
We want use tls sparsely due to the cost involved
We cannot just track additional information in VRT_CTX, because it is (for very good reasons) const.

So I looked at the one case where VCL (actually VGC) modifies the CTX - indirectly, which is *ctx->handling. Outside the CLI, it lives in the worker. A simple approach would have been to just add unsigned vpi_count to struct worker and unsigned *vpi_count to struct vrt_ctx. But that would have eaten up half the space I just found wasted in adfac94537004b1b273316e911ca61dfa8960fae and also it appeared to me that it might be about time to give VGC it's own writable bit of VRT_CTX, to not only accommodate the to-be-moved vpi_count, but also to solve the nuisance that evil vmods could fiddle directly with *ctx->handling (this was discussed in the course of #3163).

This PR is an RFC (really!) to discuss

if we want to go this route
and, if yes, how

You will find at least one spot which I am not happy about, and we can discuss implementation questions later. At this point, my question really is the above.

Feb 01 '21 19:02 nigoroll

I forgot: I recommend to only look at the first commit https://github.com/varnishcache/varnish-cache/pull/3515/commits/f2747392243c75ed668a257642c870a2770f9435

The other changes have been scripted using coccinelle.

Feb 01 '21 19:02 nigoroll

FTR: today's bugwash seemed to be generally in favor, Dridi wants to have a closer look still.

Feb 08 '21 18:02 nigoroll

With VRT_handled() in, no more reserves.

Feb 10 '21 11:02 dridi

bugwash: implement vcc_trace differently and move to new PR:

use debug flag
add vcl control over it

Feb 15 '21 15:02 nigoroll

force-push: Update to remove vcl tracing change

I have polished the initial commit of this PR so I am now happy with it and thus removed the Draft status.

I have also added VCL Source information to panics. Example from 82d502612b67b1040ca8940425754b50462e6fdc :

    vpi = 0x7f135e6603b0 {
      handling (VCL::return) = 0x0 (none),
      ref = 6,
      vpi_ref = 0x7f1359ffd920 {
        source = 0 (\"<vcl.inline>\"),
        offset = 365,
        line = 20,
        pos = 25,
        token = vtc
      },
    },

I have not yet opened a PR for the reworked vcl tracing because it builds on this PR. It is available here: https://github.com/nigoroll/varnish-cache/commit/75a30d6391da16edf65dcbce8034f82d57800ccc

Feb 15 '21 17:02 nigoroll

IRC notes

in panic, dump the VCL code line instead of "token", if possible
@bsdphk questions if the VCL line info for panic can be achieved more simply and
if avoiding the VPI_count() has any measurable benefit.

Mar 29 '21 15:03 nigoroll

FYI we have a new private facility for workers: 525fe180846877dd3d36e7ca062755a756000c29...9ed99f61ebecb8471e87f3d6bdfde34b9937929d

May 05 '21 18:05 dridi

I was asked to provide some numbers on the relevance of the performance improvement enabled by this PR. This PR itself does not aim at performance improvements, but it is the basis for #3836 .

Obviously, the expected gains are higher the more VCL code gets executed.

max potential

Here are results for a testcase designed to emphasize the impact of the change, which I would consider to come close to the maximum potential. That said, real world impact is likely to be less significant. The tables show quantiles of process times in seconds for the tests laid out in detail below:

master a9ac9f0103b7bc2ab4a7059c9cda8495faf5b3bf

10%	20%	30%	40%	50%	60%	70%	80%	90%
0.000477	0.000488	0.000496	0.000499	0.000503	0.000506	0.000512	0.000521	0.000545

https://github.com/nigoroll/varnish-cache/commit/54dc0992f8365f197a1314e86ed0be5c3ae38709

10%	20%	30%	40%	50%	60%	70%	80%	90%
0.000270	0.000278	0.000284	0.000287	0.000290	0.000294	0.000298	0.000303	0.000313

min potential

These are results for a very basic case which does not favor the advantage of the change.

master a9ac9f0103b7bc2ab4a7059c9cda8495faf5b3bf

10%	20%	30%	40%	50%	60%	70%	80%	90%
2.3e-05	2.5e-05	2.5e-05	2.5e-05	2.6e-05	2.7e-05	2.7e-05	2.8e-05	3.0e-05

https://github.com/nigoroll/varnish-cache/commit/54dc0992f8365f197a1314e86ed0be5c3ae38709

10%	20%	30%	40%	50%	60%	70%	80%	90%
2.2e-05	2.4e-05	2.4e-05	2.5e-05	2.5e-05	2.6e-05	2.6e-05	2.7e-05	2.9e-05

Details

methodology

The script below was run with the test case vcls given below on the code bases to compare. The script is to be run from a varnish-cache source tree and assumes varnish-cache to be installed under VHOME set in the script

All VSL tags are masked except for Timestamp, Begin, End and Link

#!/bin/bash

typeset -r VHOME=/tmp

name="$1"

if [[ -z "${name}" ]] ; then
	echo >&2 "need name argument"
	exit 9
fi

set -eu

tags=($(awk < ./doc/sphinx/include/vsl-tags.rst '/^[A-Z]/ { print $1 }'| grep -vE 'Timestamp|Begin|End|Link'))
oIFS="${IFS}"
IFS=,;vsl="${tags[*]/#/-}"
IFS="${oIFS}"
pkill varnishlog || true
pkill varnishd || true
while pgrep varnishd ; do sleep 1 ; done
"${VHOME}"/sbin/varnishd \
	-f $PWD/bench_vcl_trace.vcl \
	-p vsl_mask="${vsl}" \
	-a 127.0.0.1:8080
# prime
echo priming
ab -n 10000 -c 1000 http://localhost:8080/ >/dev/null 2>&1
"${VHOME}"/bin/varnishlog -w "${name}".vsl &
vlog=$!
echo running
ab -n 100000 -c 1000 http://localhost:8080/ 2>&1 | tee "${name}".ab.txt
kill ${vlog}
"${VHOME}"/bin/varnishlog -g raw -r "${name}".vsl -I Timestamp:Process|awk '{print $NF;}' >"${name}".times.txt
R -q -e 'data <- as.numeric(readLines("'"${name}".times.txt'"));quantile(data, probs = seq(.1, .9, by = .1))' \
	>"${name}".quantiles.txt

max potential test case

vcl:

vcl 4.1;

backend proforma none;

sub vcl_recv {
    return (synth(200));
}

sub vcl_synth {
    if (true) { set resp.status = 200; }
    # repeat ^^^ 999 x
    return (deliver);
}

min potential test case

vcl:

vcl 4.1;

backend proforma none;

sub vcl_recv {
    return (synth(200));
}

sub vcl_synth {
    return (deliver);
}

May 07 '21 17:05 nigoroll

OKed by bugwash

Aug 15 '22 13:08 nigoroll

It would have been nice to avoid merging purposefully a broken commit a8483cf5c864a87f59b2cc26315f37e9edda2d97, this is going to break git-bisect for no good reason.

Aug 16 '22 07:08 dridi

@Dridi you have a point, sorry.

Aug 16 '22 13:08 nigoroll

I took some time to look at git-bisect and found that we can skip broken commits with a special exit status 125, so I figured we could use that in the vtc-bisect.sh script.

It turns out broken commits were mitigated since day 1 so this situation is not as bad as I thought. Also, the probability to land on this specific commit is not 1.

One of those cases I'm not used to, where my past self didn't lay a trap for his future self...

Aug 26 '22 11:08 dridi

Formalize VCL Private data access via VRT_CTX / VCL line info in panics

max potential

min potential

Details

methodology

max potential test case

min potential test case