
RFC: ESI vs future gzip VDP

Open nigoroll opened this issue 5 years ago • 4 comments

#3529 makes me want to come back to an issue that has bothered me for some time. Now that the VDP API groundwork has been laid, a gzip VDP should become possible. Assuming for a moment that it existed, the idea of ESI-including VDP-processed, gzipped content is not far off:

diff --git a/bin/varnishtest/tests/e00026.vtc b/bin/varnishtest/tests/e00026.vtc
index 7fa85d0cf..6a34864d3 100644
--- a/bin/varnishtest/tests/e00026.vtc
+++ b/bin/varnishtest/tests/e00026.vtc
@@ -36,6 +36,8 @@ server s1 {
 } -start
 
 varnish v1 -vcl+backend {
+       import debug;
+
        sub vcl_backend_response {
                if (bereq.url ~ "^/.$") {
                        set beresp.do_esi = true;
@@ -48,6 +50,9 @@ varnish v1 -vcl+backend {
                if (req.esi_level > 0 && req.http.r3529) {
                        unset req.http.Accept-Encoding;
                }
+               if (req.esi_level > 0 && req.http.rot13) {
+                       set resp.filters = "gunzip rot13 gzip";
+               }
                set resp.http.filters = resp.filters;
        }
 } -start

Note that gunzip rot13 has been possible since d2be2bfc0f59ef79cd8efa4720ace4e49e55dcb6 and causes the pretendgz VDP to do the right thing when used from a gzipped ESI context.

I would be interested in opinions on how we should approach this. In my mind, a hypothetical gzip VDP could be ESI aware and update the parent ESI context's gzip bits appropriately if it is the last VDP in the filter chain. Alternatively, it could provide gzip bits to gzgz, which would require some "out of band" data to be made available between VDPs.

I also wonder if filters should learn about encodings: the gunzip filter would specify that it takes "gzip" and outputs "clear", and gzip the inverse. The _init function of a filter would then be given the chance to check whether it can process the previous filter's output.

nigoroll avatar Feb 20 '21 15:02 nigoroll

Wouldn't it make more sense to give pretendgz the ability to actually gzip?

bsdphk avatar Feb 22 '21 10:02 bsdphk

I would not want to remove the pretend gzip option, but yes, offering a choice of compression level (e.g. 0 for pretend up to 9 for best) would be nice. I am not sure whether it would make more sense to have a generic gzip compressor which could talk to gzgz, or separate VDPs for ESI vs. non-ESI.

nigoroll avatar Feb 22 '21 10:02 nigoroll

What worries me here is the "reaching across layers" aspect of the necessary communication...

(The 'pretend_gzip' is identical to what happens when you give compression level 0.)

bsdphk avatar Feb 22 '21 12:02 bsdphk

This is exactly what worries me too, which is why I wonder if we should make this kind of out-of-band information part of the filter API(s).

nigoroll avatar Feb 22 '21 13:02 nigoroll

After closer inspection I am closing this ticket.

The original decision not to spin up a VGZ to compress an uncompressed ESI-included subrequest remains sound: it keeps already compressed data from being compressed twice.

If you want to compress uncompressed, ESI-included objects, gzip them when you fetch them, using the gzip VFP filter.
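
A minimal VCL sketch of that advice, assuming the ESI fragments live under a hypothetical /fragments/ URL prefix (the prefix and the Content-Type check are illustrative and not from this thread). Setting beresp.do_gzip asks Varnish to run the gzip VFP during the fetch, so the object is stored compressed and no compression is needed at delivery time:

```vcl
sub vcl_backend_response {
        # Hypothetical URL prefix for ESI-included fragments.
        if (bereq.url ~ "^/fragments/" &&
            beresp.http.Content-Type ~ "^text/") {
                # Gzip at fetch time; objects that the backend already
                # delivered compressed should not be compressed again.
                set beresp.do_gzip = true;
        }
}
```

With the fragments stored gzipped, the ESI delivery code can splice them into a gzipped parent response without a separate compression step.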

bsdphk avatar Aug 15 '22 08:08 bsdphk