[Bug]: Incorrect vary caching header
Requirements
- [x] Is this a bug report? For questions or discussions use https://lemmy.ml/c/lemmy_support or the matrix chat.
- [x] Did you check to see if this issue already exists?
- [x] Is this only a single bug? Do not put multiple bugs in one issue.
- [x] Do you agree to follow the rules in our Code of Conduct?
- [x] Is this a backend issue? Use the lemmy-ui repo for UI / frontend issues.
Summary
Lemmy has various endpoints used for AP requests and also by browsers.
Some of these endpoints have overlapping URLs and are also served as cacheable, with headers like cache-control: public, max-age=60.
Lemmy does not currently list the accept header in the Vary response header. Listing it there would tell caches to treat requests by browsers differently from requests by ActivityPub clients.
This can lead to cache confusion, where a cache server may serve HTML to ActivityPub clients or activities to web browsers.
Steps to Reproduce
- Set up Lemmy with a cache in front of it
- Issue a request without the ActivityPub accept header to prime the cache
- Issue a request with the ActivityPub accept header
- See HTML returned
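The steps above can be modelled without a real deployment. The following is only a rough sketch of a Vary-aware shared cache in front of a made-up origin; the names `origin`, `ToyCache` and `second_response_type`, the path `/u/alice` and the response bodies are all hypothetical and are not taken from Lemmy or from any particular cache server. Expiry is ignored for brevity.

```python
from typing import Dict, List, Tuple

Headers = Dict[str, str]
Response = Tuple[Headers, str]


def origin(request_headers: Headers, send_vary_accept: bool) -> Response:
    """Hypothetical origin: one URL, representation chosen by the accept header."""
    if "application/activity+json" in request_headers.get("accept", ""):
        headers = {"content-type": "application/activity+json"}
        body = '{"type": "Person"}'
    else:
        headers = {"content-type": "text/html"}
        body = "<html>...</html>"
    headers["cache-control"] = "public, max-age=60"
    # The bug being modelled: accept is missing from Vary unless the flag is set.
    headers["vary"] = "Origin" + (", Accept" if send_vary_accept else "")
    return headers, body


class ToyCache:
    """Minimal shared cache that honours Vary (assumed behaviour, no expiry)."""

    def __init__(self, send_vary_accept: bool) -> None:
        self.send_vary_accept = send_vary_accept
        self.store: Dict[str, List[Tuple[Headers, Response]]] = {}

    def get(self, url: str, request_headers: Headers) -> Response:
        # A stored entry matches when every header named in its Vary list
        # has the same value as in the request that created the entry.
        for varied, response in self.store.get(url, []):
            if all(request_headers.get(n, "") == v for n, v in varied.items()):
                return response
        response = origin(request_headers, self.send_vary_accept)
        names = [n.strip().lower() for n in response[0]["vary"].split(",")]
        varied = {n: request_headers.get(n, "") for n in names}
        self.store.setdefault(url, []).append((varied, response))
        return response


def second_response_type(send_vary_accept: bool) -> str:
    """Prime the cache as a browser, then ask again as an ActivityPub client."""
    cache = ToyCache(send_vary_accept)
    cache.get("/u/alice", {"accept": "text/html"})
    headers, _ = cache.get("/u/alice", {"accept": "application/activity+json"})
    return headers["content-type"]
```

In this model, `second_response_type(False)` returns `text/html` (the ActivityPub client gets the cached HTML), while `second_response_type(True)` returns `application/activity+json`, because the cache then stores a separate entry per accept value.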
Technical Details
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Vary
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Caching#vary
Some caches, most prominently Cloudflare, do not support the Vary header, which means that these overlapping URLs are not cacheable by those caches.
curl -v -H 'accept: application/activity+json, application/ld+json' -o /dev/null https://lemmy.ml
> GET / HTTP/2
> Host: lemmy.ml
> accept: application/activity+json, application/ld+json
...
< HTTP/2 200
< content-type: application/activity+json
< vary: Origin, Access-Control-Request-Method, Access-Control-Request-Headers
< cache-control: public, max-age=60
Related:
- https://github.com/LemmyNet/lemmy-ui/issues/3099
- https://github.com/LemmyNet/lemmy/issues/5633
- https://github.com/LemmyNet/lemmy-ui/issues/3100
Version
0.19.11
Lemmy Instance URL
No response
So this affects most of the federation routes, which need the Vary header added. Cache-control: public is set in the session middleware, but only if no cache-control header was set earlier. So setting it to private in each federation route would also work.
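To illustrate the ordering described here, a minimal language-neutral sketch follows, written in Python rather than Lemmy's Rust. The function names `federation_route_headers` and `session_middleware` are hypothetical stand-ins and do not correspond to actual Lemmy code. The sketch applies both ideas at once (adding Accept to Vary and setting cache-control to private in the route), which is an assumption; either one alone may be enough depending on the cache in front.

```python
from typing import Dict

Headers = Dict[str, str]


def federation_route_headers(headers: Headers) -> Headers:
    """Hypothetical route step: add Accept to Vary and mark the response private."""
    out = {k.lower(): v for k, v in headers.items()}
    names = [v.strip() for v in out.get("vary", "").split(",") if v.strip()]
    # Do not duplicate Accept, and leave "Vary: *" alone.
    if "*" not in names and "accept" not in [n.lower() for n in names]:
        names.append("Accept")
    out["vary"] = ", ".join(names)
    out["cache-control"] = "private"
    return out


def session_middleware(headers: Headers) -> Headers:
    """Hypothetical middleware step: default to public only if nothing was set."""
    out = dict(headers)
    out.setdefault("cache-control", "public, max-age=60")
    return out
```

For example, passing the vary value from the curl output above through `federation_route_headers` and then `session_middleware` yields `cache-control: private` and a vary list ending in `Accept`, while a response that never went through the route step still gets `public, max-age=60`.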