"503 Invalid object for refresh" is very hard to debug
So I've just seen my first failing backend responses with this status in the wild, and I must say that despite the fact that I was involved in the development of the feature, I have NO IDEA what is causing them. Not even varnishadm param.set vsl_mask +ObjProtocol,+ObjStatus,+ObjReason,+ObjHeader helps.
We should have another look at this.
** << BeReq >> 364517
-- Begin bereq 364516 fetch
-- VCL_use vk8s_ing_bsHkUJAaZhBMdU9rFBY9GP8yKrK6OSH5fwsLcH
-- Timestamp Start: 1758715487.960737 0.000000 0.000000
-- BereqMethod GET
-- BereqURL /
-- BereqProtocol HTTP/1.1
-- BereqHeader X-Forwarded-Proto: https
-- BereqHeader Host: ...
...
-- BereqHeader Accept-Encoding: gzip
-- BereqHeader X-Retryable: true
...
-- BereqHeader Accept-Language: de-de
-- BereqHeader X-Varnish: 330382
-- BereqHeader VK8S-Is-Bgfetch: true
-- BereqHeader X-Forwarded-For: ...
-- BereqHeader Via: 1.1 akamai.net(ghost) (AkamaiGHost), 1.1 vk8s_viking-system_viking-service-7c4d7f7c48-8nrkh (Varnish/trunk), 1.1 vk8s_viking-system_viking-service-7c4d7f7c48-jcm6n (Varnish/trunk)
-- BereqHeader If-Modified-Since: Wed, 24 Sep 2025 12:03:15 GMT
-- BereqHeader If-None-Match: W/"bf7d4-ONJkWz0XdYH7/D9cR00VEHFCSYs"
-- BereqHeader X-Varnish: 364517
-- VCL_call BACKEND_FETCH
-- VCL_Log statsd: request.backend.fetch:1|c
-- BereqUnset VK8S-Is-Bgfetch: true
-- BereqHeader VK8S-Is-Bgfetch: false
-- VCL_Log backend_fetch_branch using pool ...
-- BereqUnset X-Forwarded-For: ...
-- VCL_return fetch
-- Timestamp Fetch: 1758715487.960796 0.000059 0.000059
-- Timestamp Connected: 1758715487.961822 0.001084 0.001025
-- BackendOpen 86 vk8s_...-6ffdc98d75-v952h_8080 172.23.34.42 8080 172.23.32.62 42832 connect
-- Timestamp Bereq: 1758715487.961854 0.001117 0.000032
-- BerespProtocol HTTP/1.1
-- BerespStatus 304
-- BerespReason Not Modified
...
-- BerespHeader ETag: W/"bf7d4-ONJkWz0XdYH7/D9cR00VEHFCSYs"
...
-- BerespHeader Last-Modified: Wed, 24 Sep 2025 12:03:15 GMT
-- BerespHeader Date: Wed, 24 Sep 2025 12:04:48 GMT
...
-- Timestamp Beresp: 1758715488.371188 0.410450 0.409333
-- TTL RFC 10 10 0 1758715488 1758715488 1758715488 0 10 cacheable
-- VCL_call BACKEND_REFRESH
-- VCL_return error
-- BackendClose 86 vk8s_...-6ffdc98d75-v952h_8080 close RESP_CLOSE
-- Timestamp Error: 1758715488.371263 0.410525 0.000075
-- BerespProtocol HTTP/1.1
-- BerespStatus 503
-- BerespReason Invalid object for refresh
-- BerespHeader Date: Wed, 24 Sep 2025 12:04:48 GMT
-- BerespHeader Server: Varnish
-- VCL_call BACKEND_ERROR
...
-- VCL_return retry
-- Timestamp Retry: 1758715488.371277 0.410539 0.000014
-- Link bereq 1737363 retry
-- End
If anyone stumbles over this before #4400 gets in:
The workaround should be to add
sub vcl_refresh_valid {
if (!obj_stale.is_valid) {
unset bereq.http.if-modified-since;
unset bereq.http.if-none-match;
return (retry);
}
}
For my own case, I have taken a more drastic approach, because I already have more generic retry code.
sub vcl_backend_fetch {
+ if (bereq.retries > 0) {
+ # avoid "503 invalid object for refresh"
+ unset bereq.http.If-Modified-Since;
+ unset bereq.http.If-None-Match;
+ }
@nigoroll we are now also seeing this error in vcl_backend_error every once in a while (while handling ~1k req/sec) for normal, unsuspicious GETs. Sometimes, the errors happen every 4m - 5m (way longer than the max-age/TTL) and sometimes every 20m to 30m. This only happens with Varnish 8.0.0. Not with 7.7.3. We reverted to 7.7.3 for the time being.
@kaiburjack have you tried any of the workarounds or did you just roll back?
closing because #4400 has been merged (edit: no, this ticket is about this situation being hard to debug)