[Bug]: Deadlock between DockerClientFactory and RyukResourceReaper with JUnit 5 parallel tests
Module
Core
Testcontainers version
1.20.1
Using the latest Testcontainers version?
Yes
Host OS
macOS
Host Arch
arm64
Docker version
Client:
Version: 27.1.1
API version: 1.46
Go version: go1.21.12
Git commit: 6312585
Built: Tue Jul 23 19:54:12 2024
OS/Arch: darwin/arm64
Context: desktop-linux
Server: Docker Desktop 4.33.0 (160616)
Engine:
Version: 27.1.1
API version: 1.46 (minimum version 1.24)
Go version: go1.21.12
Git commit: cc13f95
Built: Tue Jul 23 19:57:14 2024
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: 1.7.19
GitCommit: 2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
runc:
Version: 1.1.13
GitCommit: v1.1.13-0-g58aa920
docker-init:
Version: 0.19.0
GitCommit: de40ad0
What happened?
I'm attempting to run tests in parallel with JUnit 5. One test starts a static ComposeContainer with .withLocalCompose(true) and another starts a static KafkaContainer. This leads to a deadlock on startup: one thread holds the monitor on RyukResourceReaper and blocks waiting for the lock in DockerClientFactory, while the other thread holds the DockerClientFactory lock and blocks waiting for the RyukResourceReaper monitor.
Relevant log output
Kafka container thread:
"testcontainers-lifecycle-0" #35 [41731] daemon prio=5 os_prio=31 cpu=207.18ms elapsed=23.45s tid=0x000000012225ba00 nid=41731 waiting for monitor entry [0x00000001735fa000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.testcontainers.utility.RyukResourceReaper.maybeStart(RyukResourceReaper.java:74)
- waiting to lock <0x000000060201f118> (a org.testcontainers.utility.RyukResourceReaper)
at org.testcontainers.utility.RyukResourceReaper.init(RyukResourceReaper.java:42)
at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:232)
- locked <0x000000060201ee60> (a [Ljava.lang.Object;)
at org.testcontainers.DockerClientFactory$1.getDockerClient(DockerClientFactory.java:106)
at com.github.dockerjava.api.DockerClientDelegate.authConfig(DockerClientDelegate.java:109)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:329)
Compose container thread:
"testcontainers-lifecycle-1" #37 [37891] daemon prio=5 os_prio=31 cpu=3.14ms elapsed=23.41s tid=0x0000000122254600 nid=37891 waiting for monitor entry [0x0000000173a12000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.testcontainers.DockerClientFactory.client(DockerClientFactory.java:185)
- waiting to lock <0x000000060201ee60> (a [Ljava.lang.Object;)
at org.testcontainers.DockerClientFactory$1.getDockerClient(DockerClientFactory.java:106)
at com.github.dockerjava.api.DockerClientDelegate.authConfig(DockerClientDelegate.java:109)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:329)
at org.testcontainers.utility.RyukResourceReaper.maybeStart(RyukResourceReaper.java:78)
- locked <0x000000060201f118> (a org.testcontainers.utility.RyukResourceReaper)
at org.testcontainers.utility.RyukResourceReaper.registerLabelsFilterForCleanup(RyukResourceReaper.java:51)
at org.testcontainers.containers.ComposeDelegate.registerContainersForShutdown(ComposeDelegate.java:247)
at org.testcontainers.containers.ComposeContainer.start(ComposeContainer.java:125)
- locked <0x000000060201f2a8> (a java.lang.Object)
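For illustration, the cycle in the two traces above can be reduced to a plain-Java sketch. This is not Testcontainers code: the class name, thread names, and the two lock objects are hypothetical stand-ins, with clientLock playing the role of the Object[] monitor taken in DockerClientFactory.client and reaperLock playing the role of the RyukResourceReaper instance. It only shows that taking the same two monitors in opposite orders leaves both threads BLOCKED.

```java
import java.util.concurrent.CountDownLatch;

public class LockOrderInversion {

    // Starts a daemon thread that takes `first`, waits until the other thread
    // also holds its first lock, and then tries to take `second`.
    static Thread holdThenLock(String name, Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    System.out.println(name + " acquired both locks (not expected)");
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive once stuck
        t.start();
        return t;
    }

    // Returns true once both threads are observed in state BLOCKED.
    static boolean bothBlocked() {
        Object clientLock = new Object(); // stand-in for the DockerClientFactory lock
        Object reaperLock = new Object(); // stand-in for the RyukResourceReaper monitor
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Same shape as the traces: client lock then reaper, and reaper then client lock.
        Thread kafkaLike = holdThenLock("kafka-like", clientLock, reaperLock, bothHoldFirst);
        Thread composeLike = holdThenLock("compose-like", reaperLock, clientLock, bothHoldFirst);

        long deadline = System.currentTimeMillis() + 5000;
        while (System.currentTimeMillis() < deadline) {
            if (kafkaLike.getState() == Thread.State.BLOCKED
                    && composeLike.getState() == Thread.State.BLOCKED) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("both threads BLOCKED: " + bothBlocked());
    }
}
```

In this sketch, making both threads take the two locks in the same order removes the cycle. Whether that is the right fix for the library itself is a separate question.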
Additional Information
No response
Hi @pkwarren, can you please provide a project that reproduces the issue?
Here's an example repo showing the problem: https://github.com/pkwarren/testcontainers-issue-9120
Thanks for sharing @pkwarren. I made some changes because the docker-compose.yml file was not found, and I also had to set version, but I cannot reproduce the issue. Do you mind taking a look?
the docker-compose.yml file was not found
It should be here: https://github.com/pkwarren/testcontainers-issue-9120/blob/main/docker-compose.yml
also had to set version
I don't follow - where did a version need to be specified?
can not reproduce the issue. Do you mind taking a look?
If you could provide more specifics on what you're doing and any errors you're seeing, I'd be happy to update the example project. For me, just running ./mvnw clean verify hangs. If you run jstack against the PID of the launched Maven Surefire process, you can see the deadlock.
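As an alternative to attaching jstack from outside, the same check can be done from inside the JVM with the standard java.lang.management API. The sketch below is an assumption about how one might wire this up, not something taken from the example project: the DeadlockReport class and deadlockedThreads method are hypothetical names, and the method would have to be called from a thread that is not itself stuck (for example a watchdog thread in the test JVM).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class DeadlockReport {

    // Returns one line per deadlocked thread, or an empty list if none are found.
    static List<String> deadlockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Covers object monitors and ownable synchronizers; null means no deadlock.
        long[] ids = mx.findDeadlockedThreads();
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // the thread exited between the two calls
            }
            report.add(info.getThreadName()
                    + " is waiting for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> report = deadlockedThreads();
        if (report.isEmpty()) {
            System.out.println("no deadlock detected");
        } else {
            report.forEach(System.out::println);
        }
    }
}
```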
the docker-compose.yml file was not found
I had to change from ComposeContainer("docker-compose.yml") to ComposeContainer(new File("docker-compose.yml"))
I don't follow - where did a version need to be specified?
I was talking about version in the docker-compose.yml file. But after running it again, I don't need it anymore.
I just wanted to make sure we have the same code to reproduce. After that, I ran ./mvnw clean verify and everything executed successfully. I am also running on a Mac M1 Pro.
Pushed updates to fix the ComposeContainer constructor usage and switched the container in docker-compose.yml to be Kafka (in case we're running into a race condition and starting up nginx is too fast to repro the problem). Hopefully this will allow you to see the same behavior I'm seeing.
I'm on the latest version of Docker Desktop (v4.33.0), if it matters.
Is there any progress on this? I have the exact same problem.