caddy icon indicating copy to clipboard operation
caddy copied to clipboard

[encode] Encoding bug when handle_errors is invoked

Open bminer opened this issue 1 year ago • 8 comments

The HTTP response body is corrupted due to an encoding bug, which occurs when combining encode and handle_errors directives, especially with file_server and templates.

What We Expect

Caddy will use the HTTP response from handle_errors to determine how to encode the response and send appropriate headers (i.e. Content-Encoding).

What Happens

EDIT: ~~Caddy will start compressing the payload and send Content-Encoding headers; however,~~ When an error occurs and the HTTP response is set by the handle_errors directive, the response body somehow gets corrupted. I believe this is the case even for buffered HTTP responses.

To reproduce

  • encode directive is used to enable encoding. It does not matter which encoding format is used.
  • the client sends appropriate Accept-Encoding headers to enable the encoding middleware
  • the file_server directive serves a file larger than the minimumLength for the encode directive
  • there was some sort of error that occurs during the request that causes the handle_errors directive to be invoked

Example:

Consider this Caddyfile configuration:

example.com {
	encode zstd gzip
	handle_errors {
		respond "Error"
	}
	root * /var/www/example.com
	templates
	file_server
}

Suppose file /var/www/example.com/test.html is greater than 512 bytes and contains a template error of some kind. In this case, Caddy is supposed to respond with "Error" but instead delivers a corrupted HTTP response. In the above example, the HTTP gzip-encoded response body looks like this (encoded in hexadecimal):

1f8b 0800 0009 6e88 00ff 0300 0000 0000 0000 0000 4572 726f 72

Notice the last 5 bytes are the string "Error" but the first bytes appear to be a valid gzip encoding of 0 bytes.

I am happy to write a PR for this if I can get some pointers on where to look.

bminer avatar Oct 10 '24 03:10 bminer

I wonder if this is the same issue I have recently encountered while testing with v2.9.0 beta (using promote-metrics branch) because I was facing an issue when my browse template had an error and I have handle_errors enabled as well. In the browser, I just received some kind of h2 ssl error (can't recall the exact wording). So far, I have not found the time to create a minimal reproducer (just fixed by browse template and error was gone).

Do you have DEBUG enabled and if so, do you see something like this in your logs? This was logged when I had the broken browse template:

2024-10-07T13:44:39.636+0200    DEBUG   http.stdlib     http2: panic serving 10.101.11.10:46736: runtime error: invalid memory address or nil pointer dereference
goroutine 34720 [running]:
golang.org/x/net/http2.(*serverConn).runHandler.func1()
        golang.org/x/[email protected]/http2/server.go:2407 +0x145
panic({0x1841b80?, 0x2c689a0?})
        runtime/panic.go:785 +0x132
github.com/caddyserver/caddy/v2/modules/caddyhttp.(*Server).ServeHTTP(0xc002ad2388, {0x1f75e60, 0xc00013b098}, 0xc001f84640)
        github.com/caddyserver/caddy/[email protected]/modules/caddyhttp/server.go:419 +0x15df
net/http.serverHandler.ServeHTTP({0xb675a9?}, {0x1f75e60?, 0xc00013b098?}, 0x7f540?)
        net/http/server.go:3210 +0x8e
net/http.initALPNRequest.ServeHTTP({{0x1f796b0?, 0xc001845860?}, 0xc003bbe388?, {0xc0005d6690?}}, {0x1f75e60, 0xc00013b098}, 0xc001f84640)
        net/http/server.go:3819 +0x231
golang.org/x/net/http2.(*serverConn).runHandler(0x4475bb?, 0x0?, 0x0?, 0xc0028841a0?)
        golang.org/x/[email protected]/http2/server.go:2414 +0xf5
created by golang.org/x/net/http2.(*serverConn).scheduleHandler in goroutine 34578
        golang.org/x/[email protected]/http2/server.go:2348 +0x21d

steffenbusch avatar Oct 10 '24 08:10 steffenbusch

Can you give an example of a file being served and a template that produces the error? Having a actual reproducer would help.

I'm not sure how this could be solved. If Caddy already started streaming the response, it's kinda "too late".

I think templates would have to buffer the entire response (which is heavier on memory usage) so that it only bubbles up the write to encode if there's no error in processing the template (whereas by default it streams the response). We might need to have that as an opt-in option for templates with a note in the docs saying "if you combine templates with encode and your template has the risk of throwing an error, enable template buffering".

francislavoie avatar Oct 10 '24 09:10 francislavoie

Here's an example template that has a bug. Just save as index.html and serve it up. Note that the file size is just above the 512 bytes needed to activate the encode middleware.

<!doctype html>
<html lang="en">
	<head>
		<meta charset="utf-8" />
		<meta name="viewport" content="width=device-width, initial-scale=1.0" />
		<title>Caddy Bug Demo</title>
		<meta name="author" content="Blake Miner" />
		<meta name="description" content="Blake is a software developer who obsesses about databases" />
	</head>
	<body>
		<main>
			<h1>Test</h1>
			<p>Here is the bug: {{ poop }}</p>
		</main>
		<footer>
			&copy; Copyright 2023-2024. Blake Miner. All rights reserved.
		</footer>
	</body>
</html>

bminer avatar Oct 11 '24 19:10 bminer

I'm not sure how this could be solved. If Caddy already started streaming the response, it's kinda "too late".

I hear what you're saying here, but somehow handle_errors works fine when the encode directive is removed. BUT... there are still some issues because the Content-Length is incorrect, for example.

> GET / HTTP/2
> Host: example.com
> user-agent: curl/7.81.0
> accept: */*
> 

< HTTP/2 500 
< accept-ranges: bytes
< alt-svc: h3=":443"; ma=2592000
< content-type: text/html; charset=utf-8
< etag: "d4t7yzm5gh8ved"
< last-modified: Fri, 11 Oct 2024 19:21:52 GMT
< server: Caddy
< vary: Accept-Encoding
< content-length: 517
< date: Fri, 11 Oct 2024 19:28:22 GMT
< 
* TLSv1.2 (IN), TLS header, Supplemental data (23):
* TLSv1.2 (OUT), TLS header, Supplemental data (23):
* HTTP/2 stream 0 was not closed cleanly: PROTOCOL_ERROR (err 1)
* stopped the pause stream!
* Connection #0 to host example.com left intact
curl: (92) HTTP/2 stream 0 was not closed cleanly: PROTOCOL_ERROR (err 1)
Error

Note that the last line of the curl output "Error" is the output from handle_errors

bminer avatar Oct 11 '24 19:10 bminer

I also think template errors should be emitted on the http.log.error logger with the WARN log level instead of DEBUG. Or, at least at the INFO level...

bminer avatar Oct 11 '24 19:10 bminer

I hear what you're saying here, but somehow handle_errors works fine when the encode directive is removed.

The encode directive wraps the response writer, so if a response starts being streamed, the encode wrapper makes decisions about how to manipulate the response. If an error happens after that point in the response flow, the response will be broken either way... just in different ways.

the Content-Length is incorrect, for example.

I would guess that has to do with one handler flow writing the headers that it expects to be correct, and then an error happening later that causes breakage.

I also think template errors should be emitted on the http.log.error logger with the WARN log level instead of DEBUG. Or, at least at the INFO level...

That logger is generally for HTTP requests, not internal errors... why do you have this request?

mholt avatar Oct 11 '24 19:10 mholt

If an error happens after that point in the response flow, the response will be broken either way... just in different ways.

Makes sense. I don't know if HTTP provides a mechanism to send a proper response once the stream is broken (i.e. render a 500.html page if an error occurs with a previously streamed response body).... especially when Content-Encoding is used.

I would guess that has to do with one handler flow writing the headers that it expects to be correct, and then an error happening later that causes breakage.

Makes sense.

That logger is generally for HTTP requests, not internal errors... why do you have this request?

Somewhat unrelated to this issue, but my understanding is that template errors propagate through to the http.log.error logger. They don't show up in http.log.access at all, even when enabled with a log directive.

bminer avatar Oct 11 '24 19:10 bminer

After thinking about this issue a bit more, I want to provide some clarification:

  1. Once HTTP status code and headers are sent, handle_errors cannot modify the HTTP response anymore; it's too late. Best thing Caddy can do at this point is log the error. This answer on SO describes some ways to tell the HTTP client that the response failed. When Content-Encoding is used, the response body will probably not be successfully decoded when an error like this happens, which is good.

  2. As @francislavoie mentioned:

I think templates would have to buffer the entire response (which is heavier on memory usage) so that it only bubbles up the write to encode if there's no error in processing the template (whereas by default it streams the response).

It is my understanding that Caddy does this. It reads the template from the recorded (i.e. buffered) HTTP response that was buffered by the next middleware (i.e. file_server in this example). Since the request is buffered and the error occurs before the HTTP status code is written, we should be able to do something useful in handle_errors.

I think that the true nature of the bug report really boils down to how the encode middleware interacts with the handle_errors middleware. I'm speculating that buffered responses like templates with errors in them reveal the nature of the underlying bug.

bminer avatar Oct 14 '24 21:10 bminer

Here is my proposal to fix this bug.

The encode middleware partially initializes compression, but when template processing fails, errorHandlerChain reuses the same response writer, resulting in mixed gzip headers with plain text error messages.

The encode responseWriter starts compression initialization but doesn't detect when downstream handlers return errors, leading to corrupted output when error handlers process the response.

Proposed Fix

Add a disabled flag to the encode responseWriter struct. When an error occurs in the handler chain, set this flag to prevent encoder initialization during error response processing.

Code Changes

// In responseWriter struct
disabled bool // disable encoding (for error responses)

// In ServeHTTP method
err := next.ServeHTTP(w, r)

if err != nil {
    if ew, ok := w.(*responseWriter); ok {
        ew.disabled = true
    }
}

// In init() method
if rw.disabled {
    return
}

Execution Flow

  1. encode creates responseWriter wrapper
  2. templates encounters error (e.g., {{ poop }})
  3. ServeHTTP returns err != nil
  4. encode sets disabled = true
  5. handle_errors invokes respond "Error: Status 500"
  6. respond writes to responseWriter
  7. responseWriter.init() sees disabled = true -> skips compression
  8. Client receives clean plain text

Siomachkin avatar Aug 31 '25 21:08 Siomachkin

@Siomachkin - you are on the right track here. The encode responseWriter wrapper needs to know when an error has occurred in the ServeHTTP chain and prevent the writer from writing if it hasn't already.

bminer avatar Sep 02 '25 12:09 bminer