incubator-pagespeed-ngx
Clarification about downstream caching documentation
At https://www.modpagespeed.com/doc/downstream-caching#standard
the Varnish example config adds the following lines:
# Mark HTML as uncacheable. If we can't send them purge requests they can't
# cache our html.
sub vcl_backend_response {
    if (beresp.http.Content-Type ~ "text/html") {
        unset beresp.http.Cache-Control;
        set beresp.http.Cache-Control = "no-cache, max-age=0";
    }
    return (deliver);
}
This is a bit confusing to me. As I understand it (please correct me if I'm wrong), the whole point of this section is to enable caching by letting PageSpeed purge Varnish when needed, while a portion of hit/miss requests is served with beacons. But with these lines we still have no caching for text/html. Also, a good Varnish configuration would respect a no-cache header from the backend anyway.
Also, it's not clear how HTML caching would be enabled on the PageSpeed side. Does specifying the DownstreamCachePurgeLocationPrefix and DownstreamCacheRebeaconingKey options signal PageSpeed not to set no-cache; max-age=0 for HTML pages? It's not obvious.
I think the Cache-Control header is overridden to make sure no one downstream caches the HTML.
Varnish will cache it though, and reply with cached HTML responses most of the time. (A small random sample of traffic is forced to be a cache miss, which lets PageSpeed resynthesize the response every now and then and send instrumentation beacons to the browsers.)
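To make that split concrete, here is a minimal sketch of the idea, assuming Varnish 4.x. It is not the documented config: the backend address, the 5% sample rate, and the key value are placeholders I made up, and my understanding that the PS-ShouldBeacon request header has to carry the DownstreamCacheRebeaconingKey value should be checked against the docs.

```vcl
vcl 4.0;

import std;

# Hypothetical backend address, for illustration only.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_hit {
    # Assumed sample rate: bypass the cache for roughly 5% of hits so the
    # backend (PageSpeed) can regenerate the page with beaconing code.
    if (std.random(0, 100) < 5) {
        # Placeholder value; assumed to have to match the configured
        # DownstreamCacheRebeaconingKey.
        set req.http.PS-ShouldBeacon = "example_rebeaconing_key";
        return (pass);
    }
}

sub vcl_backend_response {
    if (beresp.http.Content-Type ~ "text/html") {
        # Varnish has already derived beresp.ttl from the backend's original
        # headers by the time this sub runs, so rewriting the header here
        # changes what downstream clients see, not the TTL.
        set beresp.http.Cache-Control = "no-cache, max-age=0";
    }
    # Returning here skips the built-in vcl_backend_response logic.
    return (deliver);
}
```

Whether the HTML actually ends up stored still depends on the TTL Varnish computed from the headers the backend originally sent.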
Thank you for your comment.
> I think the cache-control is overridden to make sure no-one downstream caches it.
Do you mean another caching reverse proxy in front of Varnish? That seems a very specific thing to have in the standard configuration.
> Varnish will cache it though
From what I know, good practice is for Varnish to honor the cache headers in backend responses, and a Varnish config usually has something like this:
if (beresp.http.Cache-Control ~ "no-cache") {
    set beresp.http.X-VC-Cacheable = "NO:Cache-Control=no-cache";
    set beresp.uncacheable = true;
    set beresp.ttl = 120s;
}
I'm happy to open a PR to improve the documentation or add some clarifications if you think they're needed.
I meant to say that we wouldn't want an intermediary CDN, proxy, or browser to cache the HTML, because we can't purge those. Having said that, if you see room for doc improvements, PRs are welcome!