Caddy crashes with too many PHP fatal errors
What happened?
If the website produces several consecutive PHP fatal errors, the entire web server crashes. In our application, requests that run too long are intentionally aborted with a timeout, which raises a PHP fatal error. Since our application runs on Kubernetes, this leads to a complete restart of the pod.
However, the problem can also be reproduced with the frankenphp-demo Docker image (https://github.com/dunglas/frankenphp-demo); there, too, the entire Docker container terminates.
This can easily be reproduced with the following lines in a Symfony controller:
set_time_limit(1);
sleep(2);
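For reference, here is a minimal sketch of a full controller action around those two lines (the class name and route are illustrative and not part of frankenphp-demo):

<?php
// src/Controller/TimeoutController.php (illustrative only)

namespace App\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\Routing\Attribute\Route;

class TimeoutController extends AbstractController
{
    #[Route('/timeout', name: 'app_timeout')]
    public function __invoke(): Response
    {
        set_time_limit(1); // limit this request to 1 second of execution time
        sleep(2);          // exceed the limit -> PHP fatal error -> worker failure

        return new Response('never reached');
    }
}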
The reason for this is the following behavior: https://frankenphp.dev/docs/worker/#worker-failures
Is there a way to prevent the entire web server process from crashing? I would expect only the worker to be reset, for example.
Build Type
Docker (Debian Bookworm)
Worker Mode
Yes
Operating System
GNU/Linux
CPU Architecture
x86_64
PHP configuration
Standard PHP configuration from the frankenphp-demo Docker image.
Relevant log output
{"level":"panic","ts":1737056646.7622817,"msg":"too many consecutive worker failures","worker":"/app/public/index.php","failures":6}
panic: too many consecutive worker failures
goroutine 103 [running, locked to thread]:
go.uber.org/zap/zapcore.CheckWriteAction.OnWrite(0x80?, 0x2?, {0x4000ca8900?, 0x0?, 0x0?})
/root/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:196 +0x78
go.uber.org/zap/zapcore.(*CheckedEntry).Write(0x40002f5930, {0x4000525480, 0x2, 0x2})
/root/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:262 +0x1c4
go.uber.org/zap.(*Logger).Panic(0x400046e640?, {0x1b03066?, 0x15?}, {0x4000525480, 0x2, 0x2})
/root/go/pkg/mod/go.uber.org/[email protected]/logger.go:285 +0x54
github.com/dunglas/frankenphp.tearDownWorkerScript(0x400038e030, 0xff)
/go/src/app/thread-worker.go:135 +0x3e0
github.com/dunglas/frankenphp.(*workerThread).afterScriptExecution(0x4000955e38?, 0x40a588?)
/go/src/app/thread-worker.go:61 +0x1c
github.com/dunglas/frankenphp.go_frankenphp_after_script_execution(0x28501?, 0x0?)
/go/src/app/phpthread.go:109 +0x50
I think the only way to currently disable this behavior would be to enable the watcher (and maybe have it watch an empty directory).
Ideally, though, the crash would only happen when the server is absolutely irrecoverable. TBH I don't remember which exact scenarios this behavior was originally introduced for (just crashes on startup?). Maybe @withinboredom can help out.
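For illustration, that workaround would look roughly like this in the Caddyfile, assuming the watch option of the FrankenPHP worker block (the directory path is a placeholder; check the current FrankenPHP config docs for the exact syntax):

{
    frankenphp {
        worker {
            file /app/public/index.php
            # Per the suggestion above, enabling the watcher changes the failure
            # handling; pointing it at an empty directory avoids unwanted
            # restarts on file changes.
            watch /app/var/empty-watch-dir
        }
    }
}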
This sounds like a bug. This behavior should only happen when the worker script itself crashes, never because a request crashed. I'll take a gander today.
The issue is that a PHP timeout during a request is an irrecoverable PHP fatal error. This causes the worker itself to crash, which was counted as a worker failure. I addressed this in #1336 by ignoring fatal errors coming from requests.
This isn't ideal since, as you said, it should be possible to recover within the worker script itself and reset it, without paying the penalty of restarting the worker and initialising the application again. I consider #1336 more of a hotfix than an actual fix; properly handling fatal errors from requests still needs some thought.
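To illustrate why such a timeout is irrecoverable from userland PHP: the resulting fatal error is not a \Throwable, so it can be observed but not caught (a minimal sketch):

<?php

// A shutdown function can observe the fatal error, but it cannot resume the
// request or keep the worker script alive.
register_shutdown_function(static function (): void {
    $error = error_get_last();
    if ($error !== null && $error['type'] === E_ERROR) {
        error_log('Fatal error during request: ' . $error['message']);
    }
});

set_time_limit(1);

try {
    sleep(2); // exceeds the 1-second limit
} catch (\Throwable $e) {
    // Never reached: "Maximum execution time ... exceeded" is a fatal error
    // (E_ERROR), not an exception, so it cannot be caught here.
}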
This built-in check seems like a terrible idea. So if there are connection issues or the website is under DDoS, FrankenPHP will simply crash flat out and not recover? Why did nobody think of letting users disable that? You should not force this upon anyone. There are legitimate uses/situations, just like the one the creator of this issue has. It's bonkers to have to restart half of your infra to work around it.