caddy icon indicating copy to clipboard operation
caddy copied to clipboard

Panic and crash loop after POST-ing TLS config via admin API

Open jm-positron opened this issue 6 months ago • 8 comments

Overview

I POST-ed to the admin config endpoint to try and configure TLS, and that caused a repeated panic-crash loop in caddy.

Version 2.10.0

This is the repeated panic:

goroutine 12783 [running]:
net/http.(*conn).serve.func1()
	net/http/server.go:1947 +0xbe
panic({0x17973a0?, 0x2bab020?})
	runtime/panic.go:792 +0x132
github.com/caddyserver/caddy/v2/modules/metrics.(*AdminMetrics).serveHTTP(...)
	github.com/caddyserver/caddy/[email protected]/modules/metrics/adminmetrics.go:65
github.com/caddyserver/caddy/v2.AdminHandlerFunc.ServeHTTP(0x419474?, {0x1ed0e70?, 0xc000914030?}, 0x1ccbb90?)
	github.com/caddyserver/caddy/[email protected]/admin.go:1288 +0x29
github.com/caddyserver/caddy/v2.(*AdminConfig).newAdminHandler.(*AdminConfig).newAdminHandler.func2.func6({0x1ed0e70, 0xc000914030}, 0xc000430140)
	github.com/caddyserver/caddy/[email protected]/admin.go:242 +0x7b
net/http.HandlerFunc.ServeHTTP(0x1ed11d0?, {0x1ed0e70?, 0xc000914030?}, 0xc000640000?)
	net/http/server.go:2294 +0x29
github.com/caddyserver/caddy/v2.(*AdminConfig).newAdminHandler.func1.instrumentHandlerCounter.1({0x1ed11d0?, 0xc00041a1c0?}, 0xc000430140)
	github.com/caddyserver/caddy/[email protected]/metrics.go:56 +0x6f
net/http.HandlerFunc.ServeHTTP(0xc00099e000?, {0x1ed11d0?, 0xc00041a1c0?}, 0x726b7f?)
	net/http/server.go:2294 +0x29
net/http.(*ServeMux).ServeHTTP(0xc00099e000?, {0x1ed11d0, 0xc00041a1c0}, 0xc000430140)
	net/http/server.go:2822 +0x1c4
github.com/caddyserver/caddy/v2.adminHandler.serveHTTP({0xc00099e000, 0x0, 0x1, {0xc00046b020, 0x3, 0x3}, 0x0}, {0x1ed11d0, 0xc00041a1c0}, 0xc000430140)
	github.com/caddyserver/caddy/[email protected]/admin.go:829 +0x53f
github.com/caddyserver/caddy/v2.adminHandler.ServeHTTP({0xc00099e000, 0x0, 0x1, {0xc00046b020, 0x3, 0x3}, 0x0}, {0x1ed11d0, 0xc00041a1c0}, 0xc000430140)
	github.com/caddyserver/caddy/[email protected]/admin.go:781 +0x7e9
net/http.serverHandler.ServeHTTP({0xc00016a180?}, {0x1ed11d0?, 0xc00041a1c0?}, 0x1?)
	net/http/server.go:3301 +0x8e
net/http.(*conn).serve(0xc00088a000, {0x1ed44f8, 0xc00015acc0})
	net/http/server.go:2102 +0x625
created by net/http.(*Server).Serve in goroutine 12787
	net/http/server.go:3454 +0x485"}

Background

I was attempting to update the TLS config and set an automatic issuer, so I was writing to the URL /config/apps/tls/automation/policies.

{
    "subjects": [
        "myservice.internal"
    ],
    "issuers": [
        {
            "module": "internal"
        }
    ]
}

However, that was not working; I received the error invalid traversal path at: config/apps/tls/automation caddy.

I found this thread: https://github.com/caddyserver/caddy/issues/3501, with the following comment:

In your case, just do POST /config/ with the whole body (sans servers) and then it should be easy to add a server after that.

I thought I might be hitting that issue, so I POST-ed an empty configuration at /config:

{
    "apps": {
        "http": {},
        "tls": {
            "automation": {}
        }
    }
}

I then attempting to post my TLS config again, at the previous endpoint /config/apps/tls/automation/policies. This resulted in the panic-crash loop.

The entire interaction was automated, so I have recreated the request bodies for this report- I didn't record the exact requests in the log output.

The only seemingly relevant line from the caddy log (before the panic) is:

{"level":"error","ts":1750121211.771274,"logger":"admin.api","msg":"request error","error":"loading new config: loading tls app module: decoding module config: tls: json: cannot unmarshal object into Go struct field AutomationConfig.automation.policies of type []*caddytls.AutomationPolicy","status_code":500}

jm-positron avatar Jun 17 '25 01:06 jm-positron

the policies field is a slice type, you may try to send. @jm-positron

[{
    "subjects": [
        "myservice.internal"
    ],
    "issuers": [
        {
            "module": "internal"
        }
    ]
}]

qdongxu avatar Jun 22 '25 13:06 qdongxu

@mohammed90 Who do we know who understands the admin metrics stuff? Is @hairyhenderson the only one? The last time that code was touched was like 5 years ago if I read the blame correctly.

mholt avatar Jul 09 '25 20:07 mholt

@mohammed90 Who do we know who understands the admin metrics stuff? Is @hairyhenderson the only one? The last time that code was touched was like 5 years ago if I read the blame correctly.

Yes, Dave is probably the only one. I haven't checked which lines this fails at, but it's possible the metrics part is red-herring because the metrics handler is a middleware on all requests serving as an accountant. Does it really panic in the metrics-section?

mohammed90 avatar Jul 09 '25 20:07 mohammed90

Evidently the top of the stack trace is:

github.com/caddyserver/caddy/v2/modules/metrics.(*AdminMetrics).serveHTTP(...)
	github.com/caddyserver/caddy/[email protected]/modules/metrics/adminmetrics.go:65

mholt avatar Jul 09 '25 21:07 mholt

I wonder if the issue is the metricsHandler field is nil for some reason 🤔

https://github.com/caddyserver/caddy/blob/fb22a26b1a08a2fa3b2526d1852467904ee140f6/modules/metrics/adminmetrics.go#L65

mohammed90 avatar Jul 09 '25 21:07 mohammed90

I wonder if the issue is the metricsHandler field is nil for some reason 🤔

if it is, that means Provision wasn't called - is that possible?

hairyhenderson avatar Jul 13 '25 23:07 hairyhenderson

FWIW I've tried (unsuccessfully) to reproduce this... @jm-positron is there any more information you can give us? How are you launching Caddy? Can you provide your config?

hairyhenderson avatar Jul 13 '25 23:07 hairyhenderson

I've reproduced this behavior. The following is a description of the situation in which panic occurred. GET /metrics seems to cause panic.

Steps to reproduce

  1. Run caddy server v2.10.0 (built with xcaddy)
$ ./caddy version
v2.10.0 h1:fonubSaQKF1YANl8TXqGcn4IbIRUDdfAkpcsfI/vX5U=

$ ./caddy run
  1. POST an empty configuration at /config/ and an policies config at /config/apps/tls/automation/policies/ . At this point, an error log is displayed, but there is no panic.
curl -d @tls-auto-config.json -H 'Content-Type: application/json' localhost:2019/config/ -v
...
< HTTP/1.1 200 OK
< Date: Thu, 17 Jul 2025 17:04:38 GMT
< Content-Length: 0
< Connection: close

curl -d @panic-config.json -H 'Content-Type: application/json' localhost:2019/config/apps/tls/automation/policies/ -v
...
< HTTP/1.1 500 Internal Server Error
< Content-Type: application/json
< Date: Thu, 17 Jul 2025 17:04:38 GMT
< Content-Length: 268
<
{"error":"loading new config: loading http app module: provision http: getting tls app: loading tls app module: decoding module config: tls: json: cannot unmarshal object into Go struct field AutomationConfig.automation.policies of type []*caddytls.AutomationPolicy"}
* Connection #0 to host localhost left intact
$ cat tls-auto-config.json 
{
    "apps": {
        "http": {},
        "tls": {
            "automation": {}
        }
    }
}
$ cat panic-config.json 
{
    "subjects": [
        "myservice.internal"
    ],
    "issuers": [
        {
            "module": "internal"
        }
    ]
}
  1. Then, GET /metrics and panic happens repeatedly.
$ curl http://localhost:2019/metrics
curl: (52) Empty reply from server
panic log (caddy run )
$ ./caddy run
2025/07/17 16:47:09.431 INFO    maxprocs: Leaving GOMAXPROCS=16: CPU quota undefined
2025/07/17 16:47:09.431 INFO    GOMEMLIMIT is updated   {"package": "github.com/KimMachineGun/automemlimit/memlimit", "GOMEMLIMIT": 15096036556, "previous": 9223372036854775807}
2025/07/17 16:47:09.448 INFO    admin   admin endpoint started  {"address": "localhost:2019", "enforce_origin": false, "origins": ["//localhost:2019", "//[::1]:2019", "//127.0.0.1:2019"]}
2025/07/17 16:47:09.448 INFO    serving initial configuration
2025/07/17 16:47:17.826 INFO    admin.api       received request        {"method": "POST", "host": "localhost:2019", "uri": "/config/", "remote_ip": "127.0.0.1", "remote_port": "34860", "headers": {"Accept":["*/*"],"Content-Length":["129"],"Content-Type":["application/json"],"User-Agent":["curl/8.5.0"]}}
2025/07/17 16:47:17.827 INFO    admin   admin endpoint started  {"address": "localhost:2019", "enforce_origin": false, "origins": ["//localhost:2019", "//[::1]:2019", "//127.0.0.1:2019"]}
2025/07/17 16:47:17.827 WARN    http.auto_https server is listening only on the HTTP port, so no automatic HTTPS will be applied to this server     {"server_name": "metrics", "http_port": 80}
2025/07/17 16:47:17.827 INFO    http.log        server running  {"name": "metrics", "protocols": ["h1", "h2", "h3"]}        
2025/07/17 16:47:17.827 INFO    tls.cache.maintenance   started background certificate maintenance      {"cache": "0xc000343d00"}
2025/07/17 16:47:17.828 INFO    autosaved config (load with --resume flag)      {"file": "/home/kaytaka/.config/caddy/autosave.json"}
2025/07/17 16:47:17.830 INFO    admin   stopped previous server {"address": "localhost:2019"}
2025/07/17 16:47:17.833 INFO    tls     storage cleaning happened too recently; skipping for now        {"storage": "FileStorage:/home/kaytaka/.local/share/caddy", "instance": "258fa399-ae84-43ef-ae67-787f357529d5", "try_again": "2025/07/18 16:47:17.833", "try_again_in": 86399.999999612}
2025/07/17 16:47:17.833 INFO    tls     finished cleaning storage units
2025/07/17 16:47:17.839 INFO    admin.api       received request        {"method": "POST", "host": "localhost:2019", "uri": "/config/apps/tls/automation/policies/", "remote_ip": "127.0.0.1", "remote_port": "34868", "headers": {"Accept":["*/*"],"Content-Length":["124"],"Content-Type":["application/json"],"User-Agent":["curl/8.5.0"]}}
2025/07/17 16:47:17.839 INFO    admin   admin endpoint started  {"address": "localhost:2019", "enforce_origin": false, "origins": ["//localhost:2019", "//[::1]:2019", "//127.0.0.1:2019"]}
2025/07/17 16:47:17.839 ERROR   admin.api       request error   {"error": "loading new config: loading http app module: provision http: getting tls app: loading tls app module: decoding module config: tls: json: cannot unmarshal object into Go struct field AutomationConfig.automation.policies of type []*caddytls.AutomationPolicy", "status_code": 500}
2025/07/17 16:47:17.839 INFO    admin   stopped previous server {"address": "localhost:2019"}
2025/07/17 16:47:33.111 INFO    admin.api       received request        {"method": "GET", "host": "localhost:2019", "uri": "/config/", "remote_ip": "127.0.0.1", "remote_port": "37050", "headers": {"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"],"Accept-Encoding":["gzip, deflate, br, zstd"],"Accept-Language":["ja-JP,ja;q=0.9,en-US;q=0.8,en;q=0.7"],"Connection":["keep-alive"],"If-None-Match":["\"/config/ 63018a6e457cbd46\""],"Sec-Ch-Ua":["\"Not)A;Brand\";v=\"8\", \"Chromium\";v=\"138\", \"Google Chrome\";v=\"138\""],"Sec-Ch-Ua-Mobile":["?0"],"Sec-Ch-Ua-Platform":["\"Windows\""],"Sec-Fetch-Dest":["document"],"Sec-Fetch-Mode":["navigate"],"Sec-Fetch-Site":["none"],"Sec-Fetch-User":["?1"],"Upgrade-Insecure-Requests":["1"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"]}}
2025/07/17 16:47:40.633 INFO    admin.api       received request        {"method": "GET", "host": "localhost:2019", "uri": "/config/", "remote_ip": "127.0.0.1", "remote_port": "37050", "headers": {"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7"],"Accept-Encoding":["gzip, deflate, br, zstd"],"Accept-Language":["ja-JP,ja;q=0.9,en-US;q=0.8,en;q=0.7"],"Connection":["keep-alive"],"If-None-Match":["\"/config/ 63018a6e457cbd46\""],"Sec-Ch-Ua":["\"Not)A;Brand\";v=\"8\", \"Chromium\";v=\"138\", \"Google Chrome\";v=\"138\""],"Sec-Ch-Ua-Mobile":["?0"],"Sec-Ch-Ua-Platform":["\"Windows\""],"Sec-Fetch-Dest":["document"],"Sec-Fetch-Mode":["navigate"],"Sec-Fetch-Site":["none"],"Sec-Fetch-User":["?1"],"Upgrade-Insecure-Requests":["1"],"User-Agent":["Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36"]}}

2025/07/17 16:47:47.698 INFO    http: panic serving 127.0.0.1:37050: runtime error: invalid memory address or nil pointer dereference
goroutine 15 [running]:
net/http.(*conn).serve.func1()
        net/http/server.go:1947 +0xbe
panic({0x1798340?, 0x2bacfa0?})
        runtime/panic.go:792 +0x132
github.com/caddyserver/caddy/v2/modules/metrics.(*AdminMetrics).serveHTTP(...)
        github.com/caddyserver/caddy/[email protected]/modules/metrics/adminmetrics.go:65
github.com/caddyserver/caddy/v2.AdminHandlerFunc.ServeHTTP(0x419474?, {0x1ed2770?, 0xc00031a300?}, 0xc000891610?)
        github.com/caddyserver/caddy/[email protected]/admin.go:1288 +0x29
github.com/caddyserver/caddy/v2.(*AdminConfig).newAdminHandler.(*AdminConfig).newAdminHandler.func2.func6({0x1ed2770, 0xc00031a300}, 0xc00012e280)
        github.com/caddyserver/caddy/[email protected]/admin.go:242 +0x7b
net/http.HandlerFunc.ServeHTTP(0x1ed2ad0?, {0x1ed2770?, 0xc00031a300?}, 0x476c7a?)
        net/http/server.go:2294 +0x29
github.com/caddyserver/caddy/v2.(*AdminConfig).newAdminHandler.func1.instrumentHandlerCounter.1({0x1ed2ad0?, 0xc00001a460?}, 0xc00012e280)
        github.com/caddyserver/caddy/[email protected]/metrics.go:56 +0x6f
net/http.HandlerFunc.ServeHTTP(0xc000196e40?, {0x1ed2ad0?, 0xc00001a460?}, 0x7275df?)
        net/http/server.go:2294 +0x29
net/http.(*ServeMux).ServeHTTP(0xc000196e40?, {0x1ed2ad0, 0xc00001a460}, 0xc00012e280)
        net/http/server.go:2822 +0x1c4
github.com/caddyserver/caddy/v2.adminHandler.serveHTTP({0xc000196e40, 0x0, 0x1, {0xc00067bb48, 0x3, 0x3}, 0x0}, {0x1ed2ad0, 0xc00001a460}, 0xc00012e280)
        github.com/caddyserver/caddy/[email protected]/admin.go:829 +0x53f
github.com/caddyserver/caddy/v2.adminHandler.ServeHTTP({0xc000196e40, 0x0, 0x1, {0xc00067bb48, 0x3, 0x3}, 0x0}, {0x1ed2ad0, 0xc00001a460}, 0xc00012e280)
        github.com/caddyserver/caddy/[email protected]/admin.go:781 +0x7e9
net/http.serverHandler.ServeHTTP({0xc000a001b0?}, {0x1ed2ad0?, 0xc00001a460?}, 0x6?)
        net/http/server.go:3301 +0x8e
net/http.(*conn).serve(0xc000552630, {0x1ed5df8, 0xc000020210})
        net/http/server.go:2102 +0x625
created by net/http.(*Server).Serve in goroutine 81
        net/http/server.go:3454 +0x485

[same panic messages were displayed repeatedly]
``` 

</details>

kaytaka0 avatar Jul 17 '25 17:07 kaytaka0