
[Bug]: Testcontainers mapped ports sometimes point to the wrong container instance

Open aaomidi opened this issue 1 year ago • 0 comments

Testcontainers version

v0.33.0

Using the latest Testcontainers version?

Yes

Host OS

Darwin

Host arch

ARM

Go version

1.22

Docker version

Client:
 Version:           26.1.3
 API version:       1.45
 Go version:        go1.21.10
 Git commit:        b72abbb
 Built:             Thu May 16 08:30:38 2024
 OS/Arch:           darwin/arm64
 Context:           orbstack

Server: Docker Engine - Community
 Engine:
  Version:          26.1.4
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.11
  Git commit:       de5c9cf
  Built:            Wed Jun  5 11:29:18 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          v1.7.19
  GitCommit:        2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc:
  Version:          1.1.13
  GitCommit:        58aa9203c123022138b22cf96540c284876a7910
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker info

Client:
 Version:    26.1.3
 Context:    orbstack
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.15.1
    Path:     /Users/amir/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.3
    Path:     /Users/amir/.docker/cli-plugins/docker-compose

Server:
 Containers: 4
  Running: 4
  Paused: 0
  Stopped: 0
 Images: 695
 Server Version: 26.1.4
 Storage Driver: overlay2
  Backing Filesystem: btrfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc version: 58aa9203c123022138b22cf96540c284876a7910
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.9.8-orbstack-00170-g7b4100b7ced4
 Operating System: OrbStack
 OSType: linux
 Architecture: aarch64
 CPUs: 12
 Total Memory: 15.59GiB
 Name: orbstack
 ID: c009c543-cd1c-4337-9325-f163a10d19d8
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine
 Default Address Pools:
   Base: 192.168.215.0/24, Size: 24
   Base: 192.168.228.0/24, Size: 24
   Base: 192.168.247.0/24, Size: 24
   Base: 192.168.207.0/24, Size: 24
   Base: 192.168.167.0/24, Size: 24
   Base: 192.168.107.0/24, Size: 24
   Base: 192.168.237.0/24, Size: 24
   Base: 192.168.148.0/24, Size: 24
   Base: 192.168.214.0/24, Size: 24
   Base: 192.168.165.0/24, Size: 24
   Base: 192.168.227.0/24, Size: 24
   Base: 192.168.181.0/24, Size: 24
   Base: 192.168.158.0/24, Size: 24
   Base: 192.168.117.0/24, Size: 24
   Base: 192.168.155.0/24, Size: 24
   Base: 192.168.147.0/24, Size: 24
   Base: 192.168.229.0/24, Size: 24
   Base: 192.168.183.0/24, Size: 24
   Base: 192.168.156.0/24, Size: 24
   Base: 192.168.97.0/24, Size: 24
   Base: 192.168.171.0/24, Size: 24
   Base: 192.168.186.0/24, Size: 24
   Base: 192.168.216.0/24, Size: 24
   Base: 192.168.242.0/24, Size: 24
   Base: 192.168.166.0/24, Size: 24
   Base: 192.168.239.0/24, Size: 24
   Base: 192.168.223.0/24, Size: 24
   Base: 192.168.164.0/24, Size: 24
   Base: 192.168.163.0/24, Size: 24
   Base: 192.168.172.0/24, Size: 24
   Base: 172.17.0.0/16, Size: 16
   Base: 172.18.0.0/16, Size: 16
   Base: 172.19.0.0/16, Size: 16
   Base: 172.20.0.0/14, Size: 16
   Base: 172.24.0.0/14, Size: 16
   Base: 172.28.0.0/14, Size: 16

What happened?

We saw flaky tests when using Testcontainers to run ClickHouse containers in parallel (plain containers, not the Testcontainers ClickHouse module). The code is private, but I can share the patch that fixed the flakiness:

diff --git a/events/internal/clickhousetest/db.go b/events/internal/clickhousetest/db.go
index 0e8fa83d..30889eda 100644
--- a/events/internal/clickhousetest/db.go
+++ b/events/internal/clickhousetest/db.go
@@ -118,8 +118,7 @@ func startContainer(t *testing.T) string {
 		}
 	})
 
-	mappedPort, err := container.MappedPort(context.Background(), "9000/tcp")
-	require.NoError(t, err, "failed to get clickhouse mapped port")
-
-	return fmt.Sprintf("localhost:%d", mappedPort.Int())
+	ip, err := container.ContainerIP(context.Background())
+	require.NoError(t, err)
+	return fmt.Sprintf("%s:%d", ip, 9000)
 }

The fix was to talk to the container directly at its own IP, bypassing the host port mapping entirely.

Before this change, the behavior I saw was this: during runs of four tests in the same module (each test also calls t.Parallel()), a test would establish a connection to ClickHouse, run a query, disconnect, and reconnect to the same container, and sometimes the new connection would end up in the wrong container.

I verified that each test gets its own separate container (I spent a few hours sanity-checking this) and that the tests never reuse an existing container.
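
For anyone trying to reproduce this, one way to catch a misrouted connection is to have every endpoint report an identity and to compare it with the expected one after each (re)connect. The sketch below is not taken from the private test code: it uses plain loopback listeners as stand-ins for containers, and startIdentityServer and verifyInstance are hypothetical helper names. In a real test the identity would have to come from the container itself, for example from a query such as ClickHouse's hostName(), which is an assumption about how one might wire it up.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
	"time"
)

// startIdentityServer listens on an ephemeral loopback port and answers every
// connection with id followed by a newline. It is a stand-in for a container
// that can report which instance it is (hypothetical helper).
func startIdentityServer(id string) (net.Listener, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener was closed
			}
			fmt.Fprintln(conn, id)
			conn.Close()
		}
	}()
	return ln, nil
}

// verifyInstance dials addr, reads the identity line, and returns an error if
// it differs from want (hypothetical helper).
func verifyInstance(addr, want string) error {
	conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
	if err != nil {
		return fmt.Errorf("dial %s: %w", addr, err)
	}
	defer conn.Close()

	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return fmt.Errorf("read identity from %s: %w", addr, err)
	}
	if got := strings.TrimSpace(line); got != want {
		return fmt.Errorf("connected to %q via %s, expected %q", got, addr, want)
	}
	return nil
}

func main() {
	a, err := startIdentityServer("container-a")
	if err != nil {
		panic(err)
	}
	defer a.Close()

	b, err := startIdentityServer("container-b")
	if err != nil {
		panic(err)
	}
	defer b.Close()

	// Dialing the expected endpoint passes the check.
	fmt.Println("own endpoint:", verifyInstance(a.Addr().String(), "container-a"))
	// Dialing another instance's endpoint simulates the misrouting and is reported.
	fmt.Println("other endpoint:", verifyInstance(b.Addr().String(), "container-a"))
}
```

Running a check like this after every reconnect, against both the mapped port and the container IP, would show whether the mismatch only occurs on the mapped-port path.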

Relevant log output

No response

Additional information

No response

aaomidi avatar Aug 27 '24 00:08 aaomidi