Don't send If-Modified-Since if Last-Modified is a weak validator

Open AlveElde opened this issue 3 years ago • 1 comments

This PR changes backend revalidations to only send If-Modified-Since if the Last-Modified header is at least one second older than the Date header, to prevent revalidating content that changed within the same second.

varnishtest servers now send a Date header by default, unless -nodate is specified. While in the neighborhood, -nolen was adjusted to behave similarly to -nohost and -nodate.

RFC 9110 states (the rule is not new, though some of the wording has changed over time):

A Last-Modified time, when used as a validator in a request, is implicitly weak unless it is possible to deduce that it is strong, using the following rules:

(...)

  • The validator is being compared by an intermediate cache to the validator stored in its cache entry for the representation, and
  • That cache entry includes a Date value which is at least one second after the Last-Modified value and the cache has reason to believe that they were generated by the same clock or that there is enough difference between the Last-Modified and Date values to make clock synchronization issues unlikely.

https://www.rfc-editor.org/rfc/rfc9110#section-8.8.2.2-6.2
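The rule quoted above can be sketched roughly as follows. This is an illustration in Python, not the Varnish implementation (which is in C); the function name `last_modified_is_strong` is hypothetical, and the sketch assumes both headers come from the same origin clock, so it only checks the one-second gap.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def last_modified_is_strong(date_hdr, last_modified_hdr):
    """Return True if Date is at least one second after Last-Modified.

    Hypothetical helper: a missing or unparsable header means we cannot
    deduce strength, so the validator is treated as weak.
    """
    if date_hdr is None or last_modified_hdr is None:
        return False
    try:
        date = parsedate_to_datetime(date_hdr)
        last_modified = parsedate_to_datetime(last_modified_hdr)
        # HTTP dates have one-second resolution, so a gap of at least one
        # second rules out a change within the same second as the response.
        return date - last_modified >= timedelta(seconds=1)
    except (TypeError, ValueError):
        return False
```

With this check, a response whose Date and Last-Modified fall in the same second yields a weak validator, while a Last-Modified that is one second or more older than Date is treated as strong.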

So Last-Modified is a weak validator, unless the response also contains a Date header that is at least one second more recent than the Last-Modified header. What should we do when we only have a weak validator?

Strong validators are usable for all conditional requests, including cache validation, partial content ranges, and "lost update" avoidance. Weak validators are only usable when the client does not require exact equality with previously obtained representation data, such as when validating a cache entry or limiting a web traversal to recent changes.

https://www.rfc-editor.org/rfc/rfc9110#section-8.8.1-10

The RFC is a bit vague here, but notice that it only talks about what a client can do with a weak validator, and this section of the RFC distinguishes between clients and intermediary caches. So Varnish should not stop a client from using a weak Last-Modified as a validator, since the client may know that it is fine to do so. Varnish does not have this knowledge.

Going back 15 years, we find some discussion on the topic of weak validators:

The reason for this is the abstraction "weak validator" itself. While "validator" is a good abstraction from the details of Last-Modified and Etag, and also "strong validator" is quite clear, this can't work for "weak".

"weak validator" tries to build a common abstraction from two different, completely unrelated kinds of "weakness".

Weak etags: the weakness is not to guarantee byte-equivalence, but they guarantee semantic equivalence. Of course, the server needs some concept of semantic equivalence built in, to use weak etags. (Oh, and it would be fine, if the client would have the same idea about semantics.)

Last-Modified date: the weakness is the limited time resolution. It is unreliable (or not a validator at all), unless it meets some extra conditions. There is no concept of semantic equivalence whatsoever.

https://trac.ietf.org/trac/httpbis/ticket/101

The intent seems to be that a weak ETag indicates semantic equivalence, and can thus be used for cache revalidation. A weak Last-Modified does not indicate semantic equivalence, so we cannot use it for cache revalidation.
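As a rough sketch of that conclusion, the choice of conditional headers for a backend revalidation could look like the following. This is illustrative Python, not Varnish code; `revalidation_headers` and the `stored` dictionary (lowercase header names of the cached response) are assumptions made for the example.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def revalidation_headers(stored):
    """Pick conditional request headers for revalidating a cache entry.

    `stored` maps lowercase header names of the cached response to their
    values (a hypothetical representation chosen for this sketch).
    """
    cond = {}

    etag = stored.get("etag")
    if etag:
        # A weak ETag ("W/...") still signals semantic equivalence,
        # so both weak and strong ETags are usable for revalidation.
        cond["If-None-Match"] = etag

    last_modified = stored.get("last-modified")
    date = stored.get("date")
    if last_modified and date:
        try:
            gap = parsedate_to_datetime(date) - parsedate_to_datetime(last_modified)
            strong = gap >= timedelta(seconds=1)
        except (TypeError, ValueError):
            strong = False
        if strong:
            # Only a Last-Modified at least one second older than Date
            # is used as a validator.
            cond["If-Modified-Since"] = last_modified

    return cond
```

An entry with a weak ETag and a same-second Last-Modified would be revalidated with If-None-Match only, and an entry with nothing but a same-second Last-Modified would be fetched unconditionally.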

AlveElde, Sep 19 '22 12:09

FTR, weak validators cannot be used for If-Match requests.

nigoroll, Sep 19 '22 12:09

@nigoroll can this be merged now?

daghf, Nov 08 '22 15:11

Was this only approved by @nigoroll or during bugwash too? Did we hold the pull request because we just had a release?

dridi, Nov 09 '22 13:11

Rebased on master and squashed a commit.

AlveElde, Nov 28 '22 14:11