Can't connect to PostgreSQL container from another container when running on Mac as host

Open illesguy opened this issue 3 years ago • 6 comments

In our current setup, we have Python code running in a Docker container that uses testcontainers to create a PostgreSQL database container to test against. All requirements for starting a new container from a Docker image are met, and the host machine's Docker client is used to start the new container. The DB container starts up successfully: we can access it from localhost on the port assigned to it when exposing, and via the container's IP on the port we wanted to expose (5432). The problem arises when we try to access it via the Gateway IP of the network the two containers are in, on the port assigned to it when exposing (which, as I understand it, is what testcontainers does).

This last part only fails when the host the two containers are started on is a macOS machine; it works fine on Linux. To reproduce the issue, run the following script in a Docker container started on a macOS machine (with the host's Docker client mounted as per the requirements in this repo's README):

from testcontainers.postgres import PostgresContainer
import sqlalchemy

with PostgresContainer("postgres:9.6") as postgres:
    e = sqlalchemy.create_engine(postgres.get_connection_url())
    print(e.execute("select version()"))

We ran with Python 3.6 and the newest versions of testcontainers, sqlalchemy and psycopg2. The above script should fail in a container started on Mac and work in a container started on Linux.

This is because the networking implementation differs between Linux and Mac. When exposing ports on Mac without providing a binding port (so Docker picks a port itself), HostConfig/PortBindings for the newly started container gets set to "0". In contrast, it is left as an empty string on Linux. The "0" value causes the issue. We've opened a ticket on the docker for mac repo, which has more info about this particular issue: https://github.com/docker/for-mac/issues/5588
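To compare the two platforms, it can help to read the requested host port straight out of the inspect data. The following is a minimal sketch: `requested_host_ports` is a hypothetical helper, and the two sample dictionaries are hand-written to mirror the difference described above (assuming the "0" and empty values sit in the `HostPort` field of each binding). They are not captured output. On a real container, `attrs` would come from `docker inspect` or from the `attrs` property of a docker-py container object.

```python
from typing import Any, Dict, List


def requested_host_ports(attrs: Dict[str, Any], container_port: int, proto: str = "tcp") -> List[str]:
    """Return the HostPort values requested for one container port.

    `attrs` is the dictionary produced by `docker inspect`.
    """
    bindings = (attrs.get("HostConfig") or {}).get("PortBindings") or {}
    entries = bindings.get(f"{container_port}/{proto}") or []
    return [entry.get("HostPort", "") for entry in entries]


# Hand-written illustrative data (an assumption, not real inspect output).
MAC_LIKE = {"HostConfig": {"PortBindings": {"5432/tcp": [{"HostIp": "", "HostPort": "0"}]}}}
LINUX_LIKE = {"HostConfig": {"PortBindings": {"5432/tcp": [{"HostIp": "", "HostPort": ""}]}}}

print(requested_host_ports(MAC_LIKE, 5432))
print(requested_host_ports(LINUX_LIKE, 5432))
```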

Currently, the workaround we use in our code is to explicitly bind ports instead of just exposing them, like so:

from testcontainers.postgres import PostgresContainer
import sqlalchemy

with PostgresContainer("postgres:9.6").with_bind_ports(5432, 47000) as postgres:
    e = sqlalchemy.create_engine(postgres.get_connection_url())
    print(e.execute("select version()"))

This version works with both Mac and Linux as the host because HostConfig/PortBindings gets updated with the port we've provided in both cases. The downside is that we always have to manually find and bind a port that is available.
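One way to reduce the manual step is to ask the operating system for a free port instead of hard-coding 47000. This is only a sketch: `find_free_port` is a hypothetical helper, and it reports a port that is free where the Python code runs. In the setup described here the binding is made on the Docker host, so the port could still be taken there, and another process could also grab it between the check and the container start.

```python
import socket


def find_free_port(host: str = "") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (address, port) for IPv4 sockets.
        return sock.getsockname()[1]


port = find_free_port()
print(port)
```

The result could then be passed as the second argument of `with_bind_ports(5432, port)` in place of the fixed value.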

The suggestion in the issue that I linked above is that for portability, the internal DNS and IP of containers should be used for communicating between containers rather than the Gateway IP. Would it make sense to make the change in testcontainers so it would use those to access the DB container rather than the Gateway IP? What was the specific reason for which Gateway IP was chosen as the preferred method of communication between containers?
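As a rough illustration of what addressing the DB container directly could look like, the sketch below builds a connection URL from the container's own IP and its internal port (5432) rather than the Gateway IP and the mapped port. Both function names are hypothetical, the credentials and database name are placeholders, and the sample inspect data is hand-written. This is not how testcontainers currently builds its URL.

```python
from typing import Any, Dict, Optional


def container_ip(attrs: Dict[str, Any], network: Optional[str] = None) -> str:
    """Read a container's IP from `docker inspect` data.

    If no network name is given, the first attached network is used.
    """
    networks = attrs["NetworkSettings"]["Networks"]
    if network is None:
        network = next(iter(networks))
    return networks[network]["IPAddress"]


def internal_postgres_url(attrs: Dict[str, Any], user: str, password: str,
                          dbname: str, port: int = 5432) -> str:
    """Build a SQLAlchemy URL that targets the container IP and internal port."""
    return f"postgresql+psycopg2://{user}:{password}@{container_ip(attrs)}:{port}/{dbname}"


# Hand-written illustrative data (an assumption, not real inspect output).
SAMPLE_ATTRS = {"NetworkSettings": {"Networks": {"bridge": {"IPAddress": "172.17.0.3"}}}}

print(internal_postgres_url(SAMPLE_ATTRS, "test", "test", "test"))
```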

I am happy to have a go at making the change in this repo if you accept contributions.

illesguy avatar Apr 21 '21 09:04 illesguy

Thank you for the detailed investigation. Contributions are most welcome. Maybe we should add a macOS CI run?

tillahoffmann avatar Apr 22 '21 13:04 tillahoffmann

I had a look at fixing this behaviour and found a couple of things. It looks like the default behaviour is already the suggested one, which is to use the container IP to access other containers; see here: https://github.com/testcontainers/testcontainers-python/blob/master/testcontainers/core/container.py#L86-L96

The gateway IP is the fallback if the IP of the container we are running from doesn't match the gateway IP. This brings up my first question: is it correct that on line 94 we check that the IP of the container we are running on is equal to the gateway IP? Shouldn't it just be in the same subnet, rather than equal?

When digging deeper, I looked at the docker client file and found that it is always returning localhost as the host's IP regardless of whether we are running on our host machine or already on a docker container. This is the line I'm referring to: https://github.com/testcontainers/testcontainers-python/blob/master/testcontainers/core/docker_client.py#L61

My question about the first part I linked is: would it be possible to always use the container IP of the container we want to call, instead of having the fallback logic to the gateway IP (since, as it stands, it always falls back to the gateway IP)?

If that's not a viable option, then I can try to fix the method that gets the IP we are running on, but in that case I think we also need to change the if statement here to check whether that IP is in the same subnet as the gateway IP instead of being equal to it.
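The difference between the two checks can be sketched with the standard `ipaddress` module. `in_same_subnet` is a hypothetical helper, not code from the repository, and the addresses are made-up examples on a /16 bridge network. In real inspect data the prefix length would come from the `IPPrefixLen` field of the network entry.

```python
import ipaddress


def in_same_subnet(ip: str, gateway: str, prefix_len: int) -> bool:
    """Return True if `ip` lies in the network that `gateway` belongs to."""
    # strict=False lets us derive the network from a host address like the gateway.
    network = ipaddress.ip_network(f"{gateway}/{prefix_len}", strict=False)
    return ipaddress.ip_address(ip) in network


# Made-up example addresses.
own_ip, gateway_ip = "172.17.0.3", "172.17.0.1"

print(own_ip == gateway_ip)                        # equality check
print(in_same_subnet(own_ip, gateway_ip, 16))      # subnet membership check
```

With these example values the equality check is false while the subnet check is true, which is the case the question above is about.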

illesguy avatar May 02 '21 19:05 illesguy

Hi @illesguy, thanks for the thorough investigation. I also just stumbled upon this after updating to version 3.4.1 and running on macOS (therefore having to use dockerhost).

I just suggested a change in https://github.com/testcontainers/testcontainers-python/pull/145

ghost avatar Jun 09 '21 20:06 ghost

@illesguy, let us know whether #145 (about to be pushed as 3.4.2 to pypi) fixes this issue.

tillahoffmann avatar Aug 15 '21 17:08 tillahoffmann

Did #145 fix this issue?

tillahoffmann avatar Feb 16 '23 22:02 tillahoffmann

@tillahoffmann

Did #145 fix this issue?

It did not. I still have this issue on Mac with docker desktop.

I can make it work in two different ways:

  1. Set the environment variable TC_HOST=172.17.0.1
  2. Use with_bind_ports as suggested, for example: postgres_container = PostgresContainer("postgres:14.5").with_bind_ports(5432, 47000)
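For the first option, the variable can also be set from the test code itself instead of the shell. The sketch below is a hypothetical helper (`tc_host` is not part of testcontainers) that sets TC_HOST for the duration of a block and restores the previous state afterwards. It assumes testcontainers reads TC_HOST at the time it resolves the host, as the workaround implies, and that 172.17.0.1 is the gateway of the network in use, which may differ on other setups.

```python
import os
from contextlib import contextmanager


@contextmanager
def tc_host(value: str):
    """Temporarily set the TC_HOST environment variable."""
    previous = os.environ.get("TC_HOST")
    os.environ["TC_HOST"] = value
    try:
        yield
    finally:
        # Restore whatever was there before, including "not set".
        if previous is None:
            os.environ.pop("TC_HOST", None)
        else:
            os.environ["TC_HOST"] = previous


with tc_host("172.17.0.1"):
    # The PostgresContainer would be started and queried inside this block.
    print(os.environ["TC_HOST"])
```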

Happy to make a contribution to the documentation for this limitation if you find it appropriate.

Oscmage avatar Mar 19 '23 12:03 Oscmage