Add some VCL to Fastly so docs can be purged by top- or second-level folder
As documented in https://docs.fastly.com/en/guides/wildcard-purges
It would be nice to be able to purge a whole version of the docs at once from the docsbuild scripts when updating a symlink, like PURGE /3/ and PURGE /fr/3/ and so on, instead of doing it file by file.
Some thoughts:
Implementation of Surrogate-Keys
It could be done via headers/conditions in the service configuration, though currently the docs Fastly configuration is not created from version control (outside of Fastly's own internal versioning), so I'm not sure if that's the best way to approach it.
Regardless, this is better accomplished by setting Surrogate-Key header values directly on the responses served by the backend.
Since, as far as I am aware, there's no trivial way to add HTTP headers with Sphinx, I recommend doing this with nginx add_header directives in the nginx config rather than with VCL.
Access for purging
Currently purges using Surrogate-Keys are only accessible via the authenticated API, so we'd need some mechanism for issuing them when necessary.
What about using BANs with regexes implemented purely in VCL so we can do it without the authenticated API?
https://varnish-cache.org/docs/7.2/users-guide/purging.html
Something like:
if (req.method == "BAN") {
    # Same ACL check as above:
    if (!client.ip ~ purge) {
        return(synth(403, "Not allowed."));
    }
    # Assumes req.url is a regex. This might be a bit too simple
    if (std.ban("obj.http.url ~ " + req.url)) {
        return(synth(200, "Ban added"));
    } else {
        # return ban error in 400 response
        return(synth(400, std.ban_error()));
    }
}
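To illustrate why passing req.url straight through as a regex "might be a bit too simple", here is a minimal Python sketch of the pattern a per-folder ban would probably want: anchored at the start and with metacharacters escaped. The helper name `ban_regex_for_folder` is hypothetical, and Python's `re` only stands in for Varnish's PCRE here; the two agree on patterns this simple, but that is an assumption rather than something tested against Varnish.

```python
import re


def ban_regex_for_folder(folder: str) -> str:
    """Return an anchored, escaped regex matching every URL under `folder`."""
    # Normalise to exactly one leading and one trailing slash.
    normalised = "/" + folder.strip("/") + "/"
    # Escape metacharacters (e.g. the "." in "3.1") and anchor at the start.
    return "^" + re.escape(normalised)


# The raw URL used as a regex, versus the anchored and escaped form.
naive = "/3.1/"
strict = ban_regex_for_folder("/3.1/")

for url in ("/3.1/library/os.html", "/fr/3.1/library/os.html", "/3x1/index.html"):
    # Columns: URL, matched by the naive pattern, matched by the strict pattern.
    print(url, bool(re.search(naive, url)), bool(re.search(strict, url)))
```

The naive pattern also hits the French tree and any path where "." happens to match another character, which is the kind of over-matching a BAN for a single version should avoid.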
While we're at it, an IP whitelist for PURGE and BAN would be a good way to avoid ReDoS attacks.
Surrogate-Key purges aren't exposed via Fastly's config/VCL but through their API, so I don't think that will work.
I've almost never used Fastly; I only ran plain Varnish in prod. Is Fastly's VCL restricted in some way that blocks us from playing this kind of trick? :(
This would be useful, if for nothing else than to reduce the thousands of lines in the docsbuild logs taken up by PURGE commands.
Sphinx just creates static files, so it seems adding headers would need to be done via NGINX, but I'm not too sure what rule we should use. We purge one version-language pair at a time, so having a Surrogate-Key of e.g. 3.14/en or 3.14-en etc. would be best.
#510 adds Surrogate-Keys to the backend response, but the build scripts still need updating and Fastly authentication still needs configuring before this is complete.
PURGE commands for surrogate keys must be issued via the Fastly API: https://www.fastly.com/documentation/reference/api/purging/#purge-tag
#512 exposes FASTLY_SERVICE_ID and FASTLY_TOKEN that can be used.
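As a rough sketch of what the build-script side could look like, the following builds (but does not send) a purge request using only the standard library. It assumes Fastly's bulk surrogate-key endpoint, `POST /service/{service_id}/purge` with a `surrogate_keys` JSON body, rather than the single-key endpoint linked above, because keys such as `en/3` contain a slash; whether that is the better choice here is an assumption. The function name and the placeholder fallbacks are hypothetical.

```python
import json
import os
import urllib.request


def build_key_purge_request(
    service_id: str, token: str, keys: list[str]
) -> urllib.request.Request:
    """Build a Fastly bulk purge request for the given surrogate keys."""
    body = json.dumps({"surrogate_keys": keys}).encode("utf-8")
    return urllib.request.Request(
        f"https://api.fastly.com/service/{service_id}/purge",
        data=body,
        method="POST",
        headers={
            "Fastly-Key": token,  # API token, sent as a header
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# FASTLY_SERVICE_ID and FASTLY_TOKEN are the variables exposed by #512;
# the fallback strings are placeholders so the sketch runs without them.
request = build_key_purge_request(
    os.environ.get("FASTLY_SERVICE_ID", "SERVICE_ID_PLACEHOLDER"),
    os.environ.get("FASTLY_TOKEN", "TOKEN_PLACEHOLDER"),
    ["en/3"],
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of `urllib.request.urlopen(request, timeout=30)` plus error handling, which is left out so the sketch has no side effects.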
And just thinking about rollout... It is likely wise to continue to purge by URL and Surrogate-Key for a week or two before switching to Surrogate-Key only, as responses in cache from before #510's rollout will not have the Surrogate-Keys applied.
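One way to express that overlap in the build scripts, purely as a sketch: always purge by key, and keep purging by URL until a grace period after the key rollout has passed. The function name, the 14-day default, and the dates below are all hypothetical.

```python
from datetime import date, timedelta


def purge_targets(
    changed_urls: list[str],
    surrogate_key: str,
    key_rollout: date,
    today: date,
    overlap: timedelta = timedelta(days=14),
) -> list[tuple[str, str]]:
    """Return (kind, target) pairs describing which purges to issue."""
    # The key purge covers every response cached after the rollout.
    targets = [("surrogate-key", surrogate_key)]
    # Responses cached before the rollout carry no key, so keep purging
    # them by URL until the overlap window has elapsed.
    if today < key_rollout + overlap:
        targets.extend(("url", url) for url in changed_urls)
    return targets


# Hypothetical dates: one inside the overlap window, one after it.
rollout = date(2024, 10, 10)
print(purge_targets(["/3/index.html"], "en/3", rollout, date(2024, 10, 15)))
print(purge_targets(["/3/index.html"], "en/3", rollout, date(2024, 11, 15)))
```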
I'm not sure my PR worked...
>>> import requests
>>> assert 'Surrogate-Key' not in requests.get('https://docs.python.org/3/').headers
The Surrogate-Key header is consumed by Fastly.
Ah, that makes more sense -- thanks.
$ curl -I -H"Host: docs.python.org" https://lb.nyc1.psf.io:443/3/ -k
HTTP/2 200
server: nginx
date: Thu, 10 Oct 2024 16:28:11 GMT
content-type: text/html
content-length: 17141
last-modified: Thu, 10 Oct 2024 12:04:47 GMT
etag: "6707c2df-42f5"
surrogate-key: en/3
accept-ranges: bytes
x-clacks-overhead: GNU Terry Pratchett
strict-transport-security: max-age=315360000; includeSubDomains; preload
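The `surrogate-key: en/3` value returned for `/3/` suggests a `language/version` key, with English implied when the path has no language prefix. The build scripts would need to derive the same key when purging; a minimal sketch of that mapping follows. The nginx rule from #510 is not shown in this thread, so the language-code pattern and the helper name `surrogate_key_for` are assumptions that only reproduce the one observed case.

```python
import re
from typing import Optional

# Assumed shape of a language prefix: "fr", "pt-br", and similar.
_LANG = re.compile(r"^[a-z]{2}(-[a-z]{2})?$")


def surrogate_key_for(path: str) -> Optional[str]:
    """Map a docs URL path to a "language/version" key, or None."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        return None
    if _LANG.match(segments[0]):
        # Language-prefixed path: /fr/3/... -> "fr/3"
        if len(segments) < 2:
            return None
        return f"{segments[0]}/{segments[1]}"
    # No language prefix: treat the first segment as the version, in English.
    return f"en/{segments[0]}"


for path in ("/3/", "/fr/3/library/os.html", "/pt-br/3.12/"):
    print(path, surrogate_key_for(path))
```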