`io_uring` syscalls blocked by default in containerd ≥ 1.7.0 security profiles
What happened:
Quilkin proxy panics on startup, as reported in this closed issue, where it was assumed to occur only on older versions of Linux. I'm seeing it on Ubuntu 24.04.
What you expected to happen:
Quilkin starts normally, as it does on Ubuntu 22.04.
How to reproduce it (as minimally and precisely as possible):
As the OP in the link above did with this command:
docker run -e RUST_BACKTRACE=1 --network host us-docker.pkg.dev/quilkin/ci/quilkin:0.10.0-dev-083d425 proxy --port 28868 --to 127.0.0.1:28869
or with the latest release (0.9.0):
docker run --rm -e RUST_BACKTRACE=1 us-docker.pkg.dev/quilkin/release/quilkin:0.9.0 proxy --to 127.0.0.1:7778
Anything else we need to know?:
I'm seeing this after having installed Ubuntu 24.04 from scratch on a new system. It works fine on my Ubuntu 22.04 system. Luckily, Quilkin version 0.8.0 works on my new Ubuntu 24.04 system (so I've downgraded for the time being).
I installed Docker using `sudo apt install docker.io docker-compose-v2` on a cleanly installed and updated Ubuntu 24.04 system.
Environment:
- Quilkin version: 0.9.0 and 0.10.0-dev-083d425 (and newer versions)
- Execution environment (binary, container, etc): Docker version 26.1.3, build 26.1.3-0ubuntu1~24.04.1
- Operating system:
Linux Zen 6.11.0-21-generic #21~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Feb 24 16:52:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
- Custom filters? No
- Others: (Ubuntu GLIBC 2.39-0ubuntu8.4) 2.39
Logs (Quilkin 0.10.0-dev-083d425)
$ docker run -e RUST_BACKTRACE=1 --network host us-docker.pkg.dev/quilkin/ci/quilkin:0.10.0-dev-083d425 proxy --port 28868 --to 127.0.0.1:28869
Unable to find image 'us-docker.pkg.dev/quilkin/ci/quilkin:0.10.0-dev-083d425' locally
0.10.0-dev-083d425: Pulling from quilkin/ci/quilkin
c6b97f964990: Pull complete
bfb59b82a9b6: Pull complete
8ffb3c3cf71a: Pull complete
a62778643d56: Pull complete
7c12895b777b: Pull complete
33e068de2649: Pull complete
5664b15f108b: Pull complete
0bab15eea81d: Pull complete
4aa0ea1413d3: Pull complete
da7816fa955e: Pull complete
9aee425378d2: Pull complete
06e8c7084bea: Pull complete
f823d6cf5f75: Pull complete
6f971e93c4e2: Pull complete
c83c31ce41af: Pull complete
0cb5c07f8edd: Pull complete
235b0d55b8e1: Pull complete
2520628d5c59: Pull complete
dfc844d5cefa: Pull complete
Digest: sha256:dc113d39ab1a775ca7c7fb5bfa373841fc34058637662e9db936663891084c5b
Status: Downloaded newer image for us-docker.pkg.dev/quilkin/ci/quilkin:0.10.0-dev-083d425
{"timestamp":"2025-03-26T21:13:28.545140Z","level":"INFO","fields":{"message":"Starting Quilkin","version":"0.10.0-dev","commit":"083d4255081d7525ab660afe1293b47a553fcfc4"},"target":"quilkin::cli","filename":"src/cli.rs","threadId":"ThreadId(1)"}
{"timestamp":"2025-03-26T21:13:28.545442Z","level":"INFO","fields":{"message":"Starting admin endpoint","address":"[::]:8000"},"target":"quilkin::components::admin","filename":"src/components/admin.rs","threadId":"ThreadId(1)"}
{"timestamp":"2025-03-26T21:13:28.545605Z","level":"INFO","fields":{"message":"Starting proxy","port":28868,"proxy_id":"Zen"},"target":"quilkin::cli::proxy","filename":"src/cli/proxy.rs","span":{"name":"run"},"spans":[{"name":"run"}],"threadId":"ThreadId(1)"}
thread 'main' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/blocking/shutdown.rs:51:21:
Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context.
stack backtrace:
{"timestamp":"2025-03-26T21:13:28.546238Z","level":"ERROR","fields":{"message":"Panic has occurred. Moving to Unhealthy","panic_info":"panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/blocking/shutdown.rs:51:21:\nCannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context."},"target":"quilkin::components::admin::health","filename":"src/components/admin/health.rs","span":{"name":"run"},"spans":[{"name":"run"}],"threadId":"ThreadId(1)"}
0: rust_begin_unwind
1: core::panicking::panic_fmt
2: tokio::runtime::blocking::pool::BlockingPool::shutdown
3: core::ptr::drop_in_place<tokio::runtime::blocking::pool::BlockingPool>
4: quilkin::components::proxy::io_uring_shared::IoUringLoop::spawn
5: quilkin::components::proxy::Proxy::run::{{closure}}
6: quilkin::cli::proxy::Proxy::run::{{closure}}::{{closure}}
7: quilkin::cli::Cli::drive::{{closure}}::{{closure}}
8: quilkin::main::{{closure}}
9: quilkin::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Logs (Quilkin 0.9.0)
$ docker run --rm -e RUST_BACKTRACE=1 us-docker.pkg.dev/quilkin/release/quilkin:0.9.0 proxy --to 127.0.0.1:7778
{"timestamp":"2025-03-26T21:25:21.043225Z","level":"INFO","fields":{"message":"Starting Quilkin","version":"0.9.0","commit":"b62ba024a0c7e5bc27eda6c2b785705ffe3d64bb"},"target":"quilkin::cli","filename":"src/cli.rs","threadId":"ThreadId(1)"}
{"timestamp":"2025-03-26T21:25:21.043499Z","level":"INFO","fields":{"message":"Starting admin endpoint","address":"[::]:8000"},"target":"quilkin::components::admin","filename":"src/components/admin.rs","threadId":"ThreadId(1)"}
{"timestamp":"2025-03-26T21:25:21.043729Z","level":"INFO","fields":{"message":"Starting proxy","port":7777,"proxy_id":"f0ac434e0816"},"target":"quilkin::cli::proxy","filename":"src/cli/proxy.rs","span":{"name":"run"},"spans":[{"name":"run"}],"threadId":"ThreadId(1)"}
thread 'main' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/blocking/shutdown.rs:51:21:
Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context.
stack backtrace:
{"timestamp":"2025-03-26T21:25:21.044588Z","level":"ERROR","fields":{"message":"Panic has occurred. Moving to Unhealthy","panic_info":"panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.39.2/src/runtime/blocking/shutdown.rs:51:21:\nCannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context."},"target":"quilkin::components::admin::health","filename":"src/components/admin/health.rs","span":{"name":"run"},"spans":[{"name":"run"}],"threadId":"ThreadId(1)"}
0: rust_begin_unwind
at ./rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/std/src/panicking.rs:647:5
1: core::panicking::panic_fmt
at ./rustc/7cf61ebde7b22796c69757901dd346d0fe70bd97/library/core/src/panicking.rs:72:14
2: tokio::runtime::blocking::shutdown::Receiver::wait
3: tokio::runtime::blocking::pool::BlockingPool::shutdown
4: core::ptr::drop_in_place<tokio::runtime::blocking::pool::BlockingPool>
5: quilkin::components::proxy::io_uring_shared::IoUringLoop::spawn
6: quilkin::components::proxy::Proxy::run::{{closure}}
7: quilkin::cli::proxy::Proxy::run::{{closure}}::{{closure}}
8: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
9: quilkin::cli::Cli::drive::{{closure}}::{{closure}}
10: tokio::runtime::park::CachedParkThread::block_on
11: tokio::runtime::runtime::Runtime::block_on
12: quilkin::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
I have some more information. I did a debug build of Quilkin. It works on my Ubuntu 24.04 host, but not in the debian:bookworm-slim Docker container I placed it in. Running the debug build inside the container did hint at the problem:
2025-03-26T23:18:14.946117Z INFO ThreadId(01) quilkin::cli: src/cli.rs: Starting Quilkin version="0.10.0-dev"
2025-03-26T23:18:14.948044Z INFO ThreadId(01) quilkin::components::admin: src/components/admin.rs: Starting admin endpoint address=[::]:8000
2025-03-26T23:18:14.949591Z INFO ThreadId(01) run: quilkin::cli::proxy: src/cli/proxy.rs: Starting proxy port=7777 proxy_id="e94c442c3b94"
2025-03-26T23:18:14.950625Z INFO ThreadId(01) run: quilkin::cli::service: src/cli/service.rs: starting phoenix service port=7600
2025-03-26T23:18:14.951037Z INFO ThreadId(01) run: quilkin::cli::service: src/cli/service.rs: starting udp service port=7777
2025-03-26T23:18:14.952441Z INFO ThreadId(07) quilkin::net::phoenix: src/net/phoenix.rs: starting phoenix HTTP service addr=[::]:7600
2025-03-26T23:18:14.953161Z ERROR ThreadId(01) quilkin: src/main.rs: fatal error error=failed to spawn io-uring loop error_debug=failed to spawn io-uring loop
Caused by:
OS level error: Operation not permitted (os error 1)
So I ran the container with elevated privileges and it worked:
docker run --rm --cap-add=SYS_ADMIN --security-opt seccomp=unconfined -e RUST_BACKTRACE=1 us-docker.pkg.dev/quilkin/release/quilkin:0.9.0 proxy --to 127.0.0.1:7778
So it looks like it's something to do with differences in the Docker setup between my older Ubuntu 22.04 system and my Ubuntu 24.04 system. Should I always have been running Quilkin inside a Docker container with these additional privileges?
I've ensured the docker setup is the same on both systems and observed that --security-opt seccomp=unconfined on its own is sufficient to make it run. The Ubuntu 24.04 host seems to block io_uring in containers under default security policies; the necessary syscalls don't seem to be permitted.
I wonder if I'll have this problem if I try to deploy the container on something like AWS ECS or Kubernetes. That would be an issue.
UPDATE:
The issue is caused by changes in containerd ≥ 1.7.0, where io_uring_* syscalls (io_uring_setup, etc.) are now blocked by default in the seccomp profile used by Docker containers. This doesn't affect bare-metal or EC2 deployments where the process runs with full kernel permissions. However, in containerized environments (e.g., Docker, ECS, etc.), Quilkin fails to start unless the container is either:
- Granted elevated privileges (e.g. `--security-opt seccomp=unconfined`)
- Or running a Quilkin that avoids `io_uring` instead of crashing.
This is the relevant PR for the change to containerd. My Ubuntu 22.04 system has containerd version 1.6.22, which is why it's only been an issue since I upgraded.
Thank you for your issue! That certainly is annoying. I think the solution is probably just adding another fallback, where it goes back to the already existing epoll implementation for UDP traffic that is there for Linux and Windows.
I assume there is a way to query whether we can call io_uring that we could use for this.
I think a call to io_uring_setup() will return EPERM if there aren't sufficient permissions. That could be detected and used to fall back to the original epoll mechanism. It might also return ENOSYS on older kernels? Alternatively, a runtime flag to disable io_uring explicitly might be another option.
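To make the idea a bit more concrete, here is a rough sketch of the probe-and-fall-back approach in Rust. It is not Quilkin's code: `Backend`, `probe_io_uring`, `choose_backend` and the `DISABLE_IO_URING` environment variable are names made up for illustration. It uses only std and declares libc's `syscall` by hand, and it assumes x86_64 or aarch64 Linux, where `io_uring_setup` is syscall 425. The probe tries to create a one-entry ring, closes it again, and treats EPERM or ENOSYS as "use epoll instead".

```rust
use std::io;
use std::os::raw::{c_int, c_long, c_uint};

// libc entry points declared by hand so the sketch needs no external crates.
extern "C" {
    fn syscall(num: c_long, ...) -> c_long;
    fn close(fd: c_int) -> c_int;
}

// io_uring_setup(2) is syscall 425 on x86_64 and aarch64 Linux (assumed target).
const SYS_IO_URING_SETUP: c_long = 425;
const EPERM: i32 = 1;
const ENOSYS: i32 = 38;

// Hypothetical backend selector; not a type from Quilkin.
#[derive(Debug, PartialEq)]
enum Backend {
    IoUring,
    Epoll,
}

/// Tries to create a one-entry ring and closes it again straight away.
fn probe_io_uring() -> io::Result<()> {
    // `struct io_uring_params` is 120 bytes and must be zeroed; a u64 array
    // gives the right size and alignment without redefining the struct.
    let mut params = [0u64; 15];
    let entries: c_uint = 1;
    let fd = unsafe { syscall(SYS_IO_URING_SETUP, entries, params.as_mut_ptr()) };
    if fd < 0 {
        return Err(io::Error::last_os_error());
    }
    unsafe { close(fd as c_int) };
    Ok(())
}

/// Honors an explicit opt-out first, then runs the probe. The probe is passed
/// as a closure so it is skipped entirely when io_uring is disabled.
fn choose_backend(
    disabled: bool,
    probe: impl FnOnce() -> io::Result<()>,
) -> io::Result<Backend> {
    if disabled {
        return Ok(Backend::Epoll);
    }
    match probe() {
        Ok(()) => Ok(Backend::IoUring),
        Err(e) => match e.raw_os_error() {
            // Blocked by policy (EPERM) or missing from the kernel (ENOSYS).
            Some(EPERM) | Some(ENOSYS) => Ok(Backend::Epoll),
            // Anything else is unexpected, so surface it to the caller.
            _ => Err(e),
        },
    }
}

fn main() {
    // `DISABLE_IO_URING` is a made-up variable name for this sketch.
    let disabled = std::env::var_os("DISABLE_IO_URING").is_some();
    match choose_backend(disabled, probe_io_uring) {
        Ok(backend) => println!("selected backend: {:?}", backend),
        Err(e) => eprintln!("io_uring probe failed unexpectedly: {e}"),
    }
}
```

Run under Docker's default seccomp profile on an affected host, I'd expect this to pick the epoll path, and the io_uring path with `--security-opt seccomp=unconfined`, but I haven't verified that across setups. Whether other errors should also trigger the fallback rather than being reported is a design choice left open here.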