sanity
CDN not returning 304 when If-Modified-Since is set
Describe the bug
The Sanity CDN doesn't seem to respect or process the If-Modified-Since header and always returns a 200 response instead of 304.
To Reproduce
curl --location --head 'https://cdn.sanity.io/images/u7223sxg/production/c6c23e8c434cf24678da0d7b77b06f91aaa2aa88-500x750.jpg' \
--header 'If-Modified-Since: Wed, 04 Aug 2022 14:20:45 GMT'
Expected behavior
The server should respond with a 304 status and no body, but it responds with 200 and the full asset in its body.
Example (incorrect) response from Sanity's CDN:
HTTP/2 200
content-length: 64426
x-b3-traceid: 9e8ac3d438cfeb50c34dc986fee48db9
x-b3-parentspanid: a69592b7e5ec68c8
x-b3-spanid: 696ff93e39285f17
x-b3-sampled: 0
vary: origin
x-sanity-asset-storage: gcs-default
content-security-policy: script-src 'none'
x-content-type-options: nosniff
xkey: project-u7223sxg-production
x-varnish-age: 2344
accept-ranges: bytes
via: 1.1 google
date: Tue, 02 Aug 2022 17:28:18 GMT
cache-control: public, max-age=31536000, s-maxage=2592000
content-type: image/jpeg
age: 174504
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Correct request/response from another CDN for reference:
Request:
curl --location --head 'https://upload.wikimedia.org/wikipedia/commons/thumb/d/d1/WWW-LetShare.svg/2560px-WWW-LetShare.svg.png' \
--header 'If-Modified-Since: Wed, 03 Aug 2022 14:20:45 GMT'
Response:
HTTP/2 304
date: Thu, 04 Aug 2022 13:05:06 GMT
content-type: image/png
content-disposition: inline;filename*=UTF-8''WWW-LetShare.svg.png
etag: edfee7d45c92e952ab651e490292d36b
last-modified: Tue, 19 Apr 2022 11:51:59 GMT
server: ATS/8.0.8
age: 17647
x-cache: cp1088 hit, cp1078 miss
x-cache-status: hit-local
server-timing: cache;desc="hit-local", host;desc="cp1078"
strict-transport-security: max-age=106384710; includeSubDomains; preload
report-to: { "group": "wm_nel", "max_age": 86400, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
nel: { "report_to": "wm_nel", "max_age": 86400, "failure_fraction": 0.05, "success_fraction": 0.0}
accept-ch: Sec-CH-UA-Arch,Sec-CH-UA-Bitness,Sec-CH-UA-Full-Version-List,Sec-CH-UA-Model,Sec-CH-UA-Platform-Version
permissions-policy: interest-cohort=(),ch-ua-arch=(self "intake-analytics.wikimedia.org"),ch-ua-bitness=(self "intake-analytics.wikimedia.org"),ch-ua-full-version-list=(self "intake-analytics.wikimedia.org"),ch-ua-model=(self "intake-analytics.wikimedia.org"),ch-ua-platform-version=(self "intake-analytics.wikimedia.org")
x-client-ip: 96.250.162.220
access-control-allow-origin: *
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
timing-allow-origin: *
Which versions of Sanity are you using?
Not really applicable, but:
@sanity/cli 2.30.0 (latest: 2.30.3)
What operating system are you using?
Mac OS Monterey 12.5 (21G72)
Which versions of Node.js / npm are you running?
npm 8.5.2
Node.js v17.7.2
Additional context
- The same behavior can be observed by just loading a Sanity CDN image in Chrome and reloading the page. Sanity's CDN will always return 200, causing the whole image to be loaded, while other CDNs will correctly return 304 responses.
- We tested with assets ranging from 200kb to 50mb and are seeing the same behavior across the board
- We reviewed https://www.sanity.io/docs/asset-cdn and https://cloud.google.com/cdn/docs/caching, but haven't found any conclusive fix
Hi Benjamin!
Assets on our CDN are generally considered "immutable" and are served with a cache-control header that has a very long max-age:
➜ curl -sS --head 'https://cdn.sanity.io/images/u7223sxg/production/c6c23e8c434cf24678da0d7b77b06f91aaa2aa88-500x750.jpg' | grep cache-control
cache-control: public, max-age=31536000, s-maxage=2592000
As such, the browser shouldn't do (or need to do) a conditional request for assets (whether using if-modified-since or if-none-match). Here is an example from our sanity.io website, seen from the Chrome developer tools (filtered on cdn.sanity.io):
As you can see, all of these images are loaded from disk cache, resolving in 8ms or less. Also, note that the status code is listed as 200 even if no request was actually performed.
Is this helpful? Are you seeing the same (disk cache) or (memory cache) in the size column? Are you sure the Disable cache checkbox is not enabled?
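To make the reasoning in the reply above concrete, here is a minimal Python sketch of the freshness check a cache performs before deciding whether to contact the server at all. It is a simplified model, not how Chrome or the Sanity CDN is implemented: it only looks at max-age and an age value in seconds, and ignores Expires, no-cache, heuristic freshness, and the time a response has spent in the local cache. The function names are hypothetical.

```python
import re
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age directive in seconds, or None if it is absent.

    The pattern requires a separator (or start of string) before "max-age",
    and "s-maxage" is spelled without a hyphen, so it is never matched.
    """
    match = re.search(r"(?:^|[,\s])max-age=(\d+)", cache_control.lower())
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, age: int) -> bool:
    """Simplified check: a response is fresh while its age is below max-age."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age < max_age


# Header values taken from the response shown in this issue.
header = "public, max-age=31536000, s-maxage=2592000"
print(is_fresh(header, 174504))
```

While a stored response is fresh under this rule, a caching client can reuse it without sending any request, conditional or otherwise, which is why the developer tools show it as served from disk or memory cache.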
Thanks for looking into this, @rexxars !
With that scenario I'm seeing the same in Chrome too now, actually. It looks like what's happening in our case is that caching only works if a client implements and stores max-age/s-maxage.
We're working on a headless asset pipeline that currently only supports checks against if-modified-since and content-length. That pipeline is purely file-based and checks against local modified dates and file size. Integrating a full cache that stores HTTP responses, including max-age, wouldn't be impossible, but it would require an additional local store for HTTP responses.
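For readers unfamiliar with this kind of file-based check, a rough Python sketch follows. It is not the actual pipeline code: the function names and the exact rule for combining the status code with content-length are assumptions made for illustration. It derives the If-Modified-Since value from a local file's modification time and decides whether a download is needed from the response status and size.

```python
import os
from email.utils import formatdate
from typing import Optional


def if_modified_since_value(path: str) -> str:
    """Format the local file's mtime as an HTTP-date, which is always in GMT."""
    return formatdate(os.path.getmtime(path), usegmt=True)


def needs_download(status: int,
                   content_length: Optional[int],
                   local_size: Optional[int]) -> bool:
    """Decide whether to fetch the asset body (hypothetical rule).

    304 means the server considers the local copy current. For any other
    status, fall back to comparing the reported content-length with the
    local file size, and download when either value is unknown.
    """
    if status == 304:
        return False
    if content_length is None or local_size is None:
        return True
    return content_length != local_size
```

A client built this way keeps no HTTP metadata of its own, so it relies entirely on the server answering the conditional request with 304.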
It would be a huge plus for us if Sanity's CDN could support parsing and responding to if-modified-since headers with the proper HTTP code, but your explanation makes sense. FWIW the official Sanity documentation indicates that If-Modified-Since etc should work:
Clients can use standard cache headers such as Cache-Control, If-Modified-Since, If-None-Match, and Accept-Encoding to control cache behavior - for details, see the Google Cloud CDN documentation.
Is there a possibility for this to be implemented on the CDN side or could this at least be captured in the official documentation?
> FWIW the official Sanity documentation indicates that If-Modified-Since etc should work
Oh, we should fix that - thanks for reporting!
> Is there a possibility for this to be implemented on the CDN side or could this at least be captured in the official documentation?
Implemented on the CDN: Requires some investigation. We want to be careful not to have clients start sending conditional requests when they don't have to, but if we are confident that they won't, I don't see any reason why we shouldn't include a last-modified or etag header. I have reached out to the content lake team and asked for their input.
Documentation: most definitely.
You are correct that we are currently missing last-modified headers for our image services. We are working on getting this supported properly for our assets. We don't have a timeline atm, but it should hopefully not take too long to get out.
Thanks, that would be super helpful to have 🙇
@benjaminbojko The CDN should now return 304 with a matching if-modified-since header.
» curl -si -H "if-modified-since: Thu, 08 Oct 2020 12:00:35 GMT" https://cdn.sanity.io/images/3do82whm/next/98207d99e70275cc8188a4124875c2c0f3c4e034-1600x900.jpg\?rect\=0,50,1600,800\&w\=800\&h\=400\&fit\=clip\&auto\=format | head -n1
HTTP/2 304
Do keep in mind that the CDN is cached, so it might take time for this to reflect on already existing objects. However, new assets should work out of the box.
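For reference, the comparison a server makes when it receives if-modified-since can be sketched in a few lines of Python. This is a generic illustration of standard HTTP semantics, not Sanity's implementation, and the function name is hypothetical. It ignores if-none-match, which takes precedence when both headers are present.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_status(last_modified: str,
                       if_modified_since: Optional[str]) -> int:
    """Return 304 if the resource is unchanged since the given date, else 200.

    A missing or unparseable If-Modified-Since value is ignored, which
    results in a normal 200 response.
    """
    if not if_modified_since:
        return 200
    try:
        modified = parsedate_to_datetime(last_modified)
        since = parsedate_to_datetime(if_modified_since)
        return 304 if modified <= since else 200
    except (TypeError, ValueError):
        # Invalid dates, or a mix of dates with and without a timezone.
        return 200


# Dates taken from the Wikimedia example earlier in this issue.
print(conditional_status("Tue, 19 Apr 2022 11:51:59 GMT",
                         "Wed, 03 Aug 2022 14:20:45 GMT"))
```

The check only works when the server knows a last-modified date for the resource, which is the piece that was missing for image assets before the fix.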
@sgulseth amazing! I just tested with our script and it seems to work great 🙌
Thank you so much for reviewing this and applying a fix so quickly. We really enjoy working with Sanity so far and this gave us more confidence in continuing to use it as part of our process.
Great to hear that it worked. Thank you for reporting this! :)