varnish-cache on removing content-length for 0-length POST requests

Expected Behavior

POST requests should be exempt from content-length stripping, maybe.

Current Behavior

POST requests with an empty body are stripped from their content-length header

Possible Solution

No response

Steps to Reproduce (for bugs)

varnishtest "POST"

server s1 {
	rxreq
	txresp
	expect req.http.content-length == 0
} -start

varnish v1 -vcl+backend {

} -start

client c1 {
	txreq -method POST -hdr "content-length: 0"
	rxresp
} -run

server s1 -wait

Context

this is a follow-up on #4228, I'm not sure whether what I'm seeing here is an intended consequence of that fix.

Yvan on discord reported that lighttpd replies with a 411 if there's no content-length on a POST request, and that header is now read-only in VCL, so they can't fix it there.

Question is, who's at fault here? I would argue that Varnish is, but I'm not sure that's why I'm opening this ticket.

https://datatracker.ietf.org/doc/html/rfc7230#section-3.3.2 says:

A user agent SHOULD send a Content-Length in a request message when no Transfer-Encoding is sent and the request method defines a meaning for an enclosed payload body

POST qualifies here.

For example, a Content-Length header field is normally sent in a POST request even when the value is 0 (indicating an empty payload body).

A user agent SHOULD NOT send a Content-Length header field when the request message does not contain a payload body and the method semantics do not anticipate such a body.

A POST usually expects a body, so I think it's fair to send a content-length.

Varnish Cache version

varnishd (varnish-7.7.1 revision 2e8180f788715e5bc44df08479d60c9435d79bdd)

Operating system

arch, btw

Source of binary packages used (if any)

No response

May 15 '25 16:05 gquintard

@gquintard I think your analysis is correct: we should send POST requests with Content-Length: 0. In this case, a simple option would be to check the http method and act accordingly.

BUT, and this is important to me, I think we should really ask ourselves if taking away VCL control over Content-Length was the right decision. I understand the good intention, but good intentions don't make it right. Yes, there is a risk of users shooting themselves in the foot when fiddling with framing, but the VCL philosophy has long been that users SHOULD be able to explicitly shoot themselves in the foot, instead of preventing them from doing wrong things they might need to do, or, in this case, working around our mistakes from VCL. Removing the power of VCL will, I believe, kill Varnish-Cache. I would be curious about @dridi's opinion, but I really tend towards adding a feature flag to bring back control over framing headers, disabled by default.

Because otherwise, as discussed here, here and here, we get into territory where modifying headers needs involved solutions like filters.

So, to summarize, we should allow VCL to modify C-L again, guarded by a flag.

May 16 '25 11:05 nigoroll

I'm quoting the same paragraph from RFC9110 because RFC7230 is obsolete:

A user agent SHOULD send Content-Length in a request when the method defines a meaning for enclosed content and it is not sending Transfer-Encoding. For example, a user agent normally sends Content-Length in a POST request even when the value is 0 (indicating empty content). A user agent SHOULD NOT send a Content-Length header field when the request message does not contain content and the method semantics do not anticipate such data.

That tells me that we can technically always pass a content-length: 0 to backends, still be correct, and avoid this pitfall.

Instead of targeting PUT and POST specifically and risking not upholding the "SHOULD" condition in the face of false-negatives like you did in #4332, I would rather uphold the "SHOULD NOT" condition when we can confirm that a method does not anticipate data.

In other words, only strip content-length: 0 when we are certain that we can omit it, and otherwise leave it, with the harmless risk of having the header on unknown methods not anticipating data.

That would be the safe default that allows us to retain our restriction on framing headers. I still think to this day that we should lock framing headers down, and the consensus back then was that users still had the option of using a VMOD if fiddling with framing was a requirement (not necessarily filters).

May 16 '25 12:05 dridi

I am not sure if negating the logic buys us much, but I would not oppose the suggestion. Just please be aware of cases where people send bodies with GET.

My concern with the negated logic is that the number of methods not requiring a body is potentially much larger than the number of methods which do. And yes, I forgot PATCH.

Whatever we put into core code will be wrong for some people, hence my push for bringing back VCL control.

and the consensus back then was that users still had the option of using a VMOD if fiddling with framing was a requirement (not necessarily filters).

Our consensus was not well informed: Filters are the only option for many cases, where core code changes C-L after VCL has finished.

May 16 '25 13:05 nigoroll

Just please be aware of cases where people send bodies with GET

I'm aware, and I'm only talking about stripping content-length for backend requests without a body (whether absent or unset for example by the built-in VCL).

I also had a quick look at the lighttpd code, and found that the latest commit on request.c touches upon this very topic:

https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/71c378217c91eaf466bf1d830e15138cc66c02c6

Looking at the code further down I see how lighttpd treats the content-length+transfer-encoding differently depending on the HTTP_PARSEOPT_HEADER_STRICT flag. I would argue to the original bug reporter on discord that in addition to us improving our handling of zero-length backend request bodies they should report a bug to lighttpd to relax this SHOULD handling of the RFC, and keep the failure only in strict mode. Based on the commit message though, it sounds like an uphill battle. I'm also generally surprised that the handling is purposefully stricter than what the RFC states, and yet it's limited to a single HTTP method.

Circling back to GET with a body, what about this?

GET / HTTP/1.1
Host: example.elastic
Content-Type: text/plain
Content-Length: 0

Should we in this case leave the content-length header? Semantically, there is data associated with the request, the representation just happens to be empty. However RFC9110 does not link content-type to content-length...

May 16 '25 15:05 dridi

I'm aware, and I'm only talking about stripping content-length for backend requests without a body (whether absent or unset for example by the built-in VCL).

Yes, and then later on you seem to realize that sending Content-Length: 0 with a GET may (I have not checked) be relevant.

Again, my proposal is: Let's have a good default, either #4332 or an alternative suggestion with negated logic. Then let's gate the protected headers behind a feature flag to bring back full VCL control including the ability to shoot yourself in the foot. This will probably also require some change in processing order in core code, to avoid cases where VCL can not currently override headers.

May 17 '25 10:05 nigoroll

Yes, and then later on you seem to realize that sending Content-Length: 0 with a GET may (I have not checked) be relevant.

Not really realizing, and rather wondering. At least RFC9110 doesn't seem to make any implication so the SHOULD/SHOULD NOT on content-length:0 avoidance stands on its own.

May 19 '25 05:05 dridi

Idea from bugwash in light of https://github.com/varnishcache/varnish-cache/pull/4247#issuecomment-2640309332: What about VCL control over Content-Length: 0 only? That is, if there is no Content-Length: 0, VCL can add exactly that, if there is one, VCL can remove it. But VCL can never add or remove any other value.

May 19 '25 13:05 nigoroll

When we locked the framing headers down, I made the suggestion that we should add new variables for the content length (foo.body.length or foo.bodylen) where -1 would mean an unknown length (chunked encoding for HTTP/1.1, EOF for HTTP/1.0, missing HEADERS[END_STREAM] flag for h2). These new variables, having the type INT would become much more useful than a combination of vmod_std() with logic to find in which bucket we fall.

Why am I bringing this up? Because that "set or unset content-length:0" suggestion would result in a lot build of infrastructure changes. So instead we should consider adding new variables, especially if we can only set or unset exactly one value (aka a boolean). Reaching this "new variable" conclusion reminded me of my past point on exhibiting the expected body length in VCL, so see the previous paragraph as a drive-by comment. :]

Anyway, to materialize the bugwash suggestion, we could add something like this to the built-in VCL:

--- bin/varnishd/builtin.vcl
+++ bin/varnishd/builtin.vcl
@@ -199,11 +199,26 @@ sub vcl_backend_fetch {
 }
 
 sub vcl_builtin_backend_fetch {
+       call vcl_builtin_bereq_body;
+       call vcl_builtin_bereq_length;
+}
+
+sub vcl_builtin_bereq_body {
        if (bereq.method == "GET") {
                unset bereq.body;
        }
 }
 
+sub vcl_builtin_bereq_length {
+       if (req.method == "GET" ||
+           req.method == "HEAD" ||
+           req.method == "TRACE" ||
+           req.method == "OPTIONS" ||
+           req.method == "DELETE")
+               set bereq.length_needed = false;
+       }
+}
+
 sub vcl_backend_response {
        call vcl_builtin_backend_response;
        return (deliver);

There is a whack-a-mole aspect to this, but 1) I don't anticipate other special cases and 2) that would at least satisfy my framing headers lock down criteria. At least it would become visible in the built-in VCL and ideally solve this case without further intervention than upgrading.

May 19 '25 14:05 dridi

Hmmm...

https://github.com/varnishcache/varnish-cache/pull/4247#issuecomment-2640309332 is about controlling resp.http.c-l, and in this cases needed would not quite match, we would want something like a resp.send_length flag or similar. But this case alone already contradicts 1) I don't anticipate other special cases.

I still remember something @bsdphk must have said a long time ago along the lines of We don't want squid-like flag-ism, we have VCL which is a program controlling what happens and I still get an allergic reaction when I fear flag-isms to creep in. Being able to (un)set Content-Length: 0 (only) would, in my mind, achieve the same as the flag, but be simpler and I would expect to be able to remember that.

May 19 '25 15:05 nigoroll

#4247 (comment) is about controlling resp.http.c-l, and in this cases needed would not quite match, we would want something like a resp.send_length flag or similar. But this case alone already contradicts 1) I don't anticipate other special cases.

Not a contradiction, I covered this ticket in the built-in VCL with my suggested change to the built-in VCL, but I also said this:

So instead we should consider adding new variables

Plural, that included resp.length_needed (but I didn't check what the RFC had to say about content-length in a response).

Having the ability to only set or unset a single content-length:0 value is just creating an indirect flag that is a worse case of flag-ism to me.

May 19 '25 15:05 dridi

Having the ability to only set or unset a single content-length:0 value is just creating an indirect flag that is a worse case of flag-ism to me.

I agree to some extent. It is a suggestion for a compromise. I would still think we should have a way to shoot yourself in the foot, but if we can't have that, I would prefer to not introduce additional syntax.

May 19 '25 16:05 nigoroll

For unsafe constructs, we have VMODs. Framing headers are inherently unsafe, and we shouldn't offer a footgun to someone without a permit.

May 19 '25 16:05 dridi

https://www.rfc-editor.org/rfc/rfc9110#name-411-length-required

A server may send 411 whenever it pleases, so please prefer to send Content-Length: 0 (or HTTP/1.1 Transfer-Encoding: chunked if you must) except when certain the omission of Content-Length: 0 in HTTP requests is widely accepted, e.g. GET and HEAD. (There are different guidelines for omission on HTTP responses, e.g. HTTP response status 1xx or HTTP response status 204 or HTTP response status 304)

@dridi wrote (2 weeks ago) https://github.com/varnishcache/varnish-cache/issues/4331#issuecomment-2886975573

I also had a quick look at the lighttpd code, and found that the latest commit on request.c touches upon this very topic:

https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/71c378217c91eaf466bf1d830e15138cc66c02c6

Looking at the code further down I see how lighttpd treats the content-length+transfer-encoding differently depending on the HTTP_PARSEOPT_HEADER_STRICT flag. I would argue to the original bug reporter on discord that in addition to us improving our handling of zero-length backend request bodies they should report a bug to lighttpd to relax this SHOULD handling of the RFC, and keep the failure only in strict mode. Based on the commit message though, it sounds like an uphill battle. I'm also generally surprised that the handling is purposefully stricter than what the RFC states, and yet it's limited to a single HTTP method.

The reason lighttpd rejects missing Content-Length with POST is as a defense against request smuggling and request splitting attacks, as noted in the commit message you linked above. There is a longer discussion in https://redmine.lighttpd.net/issues/3273 but the resulting change made is that in lighttpd 1.4.78 and later, POST without Content-Length (and without Transfer-Encoding) is rejected with 411 only for HTTP/1.x, and not for HTTP/2 which has message framing.

https://git.lighttpd.net/lighttpd/lighttpd1.4/commit/71c378217c91eaf466bf1d830e15138cc66c02c6

    [core] allow POST w/o Content-Length for HTTP/2 (#3273)
    
    HTTP/2 framing delineates request headers, body, and trailers.
    
    This differs from HTTP/1.x where combinations of Content-Length
    and/or Transfer-Encoding: chunked might be used in request smuggling
    or request splitting attacks.
    
    lighttpd has rejected POST without Content-Length (and without
    Transfer-Encoding: chunked) since the very first version of lighttpd
    1.4, and that restriction is being preserved for now, even if stricter
    than RFC requirements.  Note: some other servers might interpret
    HTTP/1.0 requests with missing Content-Length to mean read body until
    EOF, and others may interpret that scenario as Content-Length: 0, and
    the inconsistency is potentially dangerous and abusable by request
    smuggling attacks.
    
    x-ref:
      "Content-Length request header is optional"
      https://redmine.lighttpd.net/issues/3273

May 27 '25 07:05 gstrauss

@gstrauss avoiding 411 responses is a compelling argument to change the logic to send C-L for anything but GET and HEAD, thank you. This is now in #4340. But FTR: Irrespective of this, the question of VCL control remains open.

Jun 02 '25 10:06 nigoroll

Coming back to VCL control over C-L:: unset xyz.http.Content-Length and set xyz.http.Content-Length = "0" can be made safe by disregarding any body, basically implying unset xyz.body. So all we need to prevent is setting Content-Length to a non-zero value. I really do not see why we should have any additional complications from the VCL perspective.

Jun 02 '25 12:06 nigoroll

The reason lighttpd rejects missing Content-Length with POST is as a defense against request smuggling and request splitting attacks, as noted in the commit message you linked above.

@gstrauss, I'm aware, I actually mentioned it:

Based on the commit message though, it sounds like an uphill battle.

What surprises me most is the limited scope of this smuggling mitigation, quoting myself again:

I'm also generally surprised that the handling is purposefully stricter than what the RFC states, and yet it's limited to a single HTTP method.

In other words, since this mitigation is targeting POST only, it "opens" (or leaves open) a smuggling vector for other methods like PUT.

Jun 02 '25 14:06 dridi