testcontainers-python
Bug: Starting multiple containers at once in parallel causes Conflict by starting ryuk container multiple times due to race condition
Describe the bug
Starting two testcontainers at once causes both of them to call Reaper._create_instance at nearly the same time. Both calls then attempt to create the same ryuk container (since SESSION_ID is the same), and the second attempt fails with a Conflict error.
This is relevant when using pytest-parallel, which my organization does. I can probably work around the problem by serializing testcontainer start() calls on one thread.
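A minimal sketch of that workaround, assuming a shared lock around start() is enough to keep the reaper from being created twice. FakeContainer and start_serialized are hypothetical names used only for illustration; FakeContainer stands in for a real testcontainer so the snippet runs without Docker.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Shared lock so that only one thread runs start() at a time.
_start_lock = threading.Lock()


class FakeContainer:
    """Hypothetical stand-in with a start() method; not a testcontainers class."""

    active_starts = 0          # start() calls currently in progress
    max_concurrent_starts = 0  # highest value active_starts ever reached

    def __init__(self, port: int):
        self.port = port
        self.started = False

    def start(self):
        cls = type(self)
        cls.active_starts += 1
        cls.max_concurrent_starts = max(cls.max_concurrent_starts, cls.active_starts)
        time.sleep(0.02)  # simulate slow container startup
        cls.active_starts -= 1
        self.started = True
        return self


def start_serialized(container):
    # Startup is serialized; work done after this returns can still overlap.
    with _start_lock:
        return container.start()


containers = [FakeContainer(port=5400 + i) for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as executor:
    started = list(executor.map(start_serialized, containers))
```

With a real testcontainer the same lock would wrap container.start(). The cost is that startup itself no longer overlaps, so this only helps if the tests that follow can still run in parallel.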
To Reproduce
from testcontainers.postgres import PostgresContainer
import concurrent.futures


def start_postgres_container(port: int):
    with PostgresContainer(port=port) as postgres:
        print("Postgres is running on port", postgres.get_exposed_port(port))


with concurrent.futures.ThreadPoolExecutor() as executor:
    future1 = executor.submit(start_postgres_container, port=5400)
    # Uncommenting this reliably avoids the error
    # import time
    # time.sleep(5)
    future2 = executor.submit(start_postgres_container, port=5401)
    future1.result()
    future2.result()
Runtime environment
Provide a summary of your runtime environment. Which operating system, python version, and docker version are you using? What is the version of testcontainers-python you are using? You can run the following commands to get the relevant information.
# Get the operating system information (on a unix os).
$ uname -a
Darwin xxx 24.1.0 Darwin Kernel Version 24.1.0: Thu Oct 10 21:03:11 PDT 2024; root:xnu-11215.41.3~2/RELEASE_ARM64_T6020 arm64 arm Darwin
$ python --version
Python 3.11.10
$ docker info
# Get all python packages.
$ pip freeze
certifi==2024.8.30
charset-normalizer==3.4.0
docker==7.1.0
idna==3.10
requests==2.32.3
testcontainers==4.8.2
typing_extensions==4.12.2
urllib3==2.2.3
wrapt==1.17.0
I ran into the same thing myself. I wanted to start the containers asynchronously but couldn't because of this. I had to override the relevant method (which is really hacky).
Would love to see this fixed.
Huge thumbs up! I would love to see a recommended way to start several containers at once; that would greatly improve test run times. I also tried multithreading and hit the same issue.
Does this actually run faster than just initializing the containers sequentially? The internal API is single-threaded.
I'm looking at doing:
from testcontainers.postgres import PostgresContainer
import concurrent.futures
from testcontainers.core.container import Reaper


def start_postgres_container(port: int):
    with PostgresContainer(port=port) as postgres:
        print("Postgres is running on port", postgres.get_exposed_port(port))


def start_pgs(num_to_start: int):
    Reaper.get_instance()  # start reaper once...
    futures = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        for i in range(num_to_start):
            ifuture = executor.submit(start_postgres_container, port=5400 + i)
            futures.append(ifuture)
        for ifuture in concurrent.futures.as_completed(futures):
            res = ifuture.result()
            print(res)


def main():
    start_pgs(20)


if __name__ == "__main__":
    main()
or even
import threading


def patch_reaper_get_instance():
    real_reaper_get_instance = Reaper.get_instance
    reaper_lock = threading.Lock()

    def patched_get_instance():
        # Let only one thread at a time check for or create the reaper
        with reaper_lock:
            return real_reaper_get_instance()

    Reaper.get_instance = patched_get_instance
and then calling patch_reaper_get_instance before starting any containers, which is maybe close to what @UltimateLobster was thinking of.
Is it just as simple as putting a lock around Reaper.get_instance?
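To make the question concrete, here is a minimal sketch of what "a lock around get_instance" amounts to for a lazily created singleton. FakeReaper is a hypothetical stand-in, not the real testcontainers Reaper, and the sleep only simulates slow container creation; whether the real class needs anything beyond this is an open question.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakeReaper:
    """Hypothetical lazily created singleton; not the testcontainers Reaper."""

    _instance = None
    _lock = threading.Lock()
    create_calls = 0  # counts how often the expensive creation step ran

    @classmethod
    def _create_instance(cls):
        cls.create_calls += 1
        time.sleep(0.05)  # simulate slow startup, widening the race window
        return cls()

    @classmethod
    def get_instance(cls):
        # Holding the lock across check-and-create means only one thread
        # can ever reach _create_instance while _instance is still None.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls._create_instance()
            return cls._instance


def run_concurrently(n_threads: int = 8):
    with ThreadPoolExecutor(max_workers=n_threads) as executor:
        return list(executor.map(lambda _: FakeReaper.get_instance(), range(n_threads)))


instances = run_concurrently()
```

Without the lock, several threads can see _instance as None before the first one finishes creating it, which is the same shape as the ryuk Conflict described above.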