Connection Issue to Liftbridge Service from Dockerized Go Client

Open umermasood opened this issue 3 months ago • 0 comments

Description

I am experiencing a persistent issue where my Dockerized Go application using the go-liftbridge client fails to connect to the Liftbridge service, despite correct DNS resolution and explicitly providing the resolved IP and port. The client attempts to connect to [::1]:9292 instead of the provided IP address.

Environment

  • Liftbridge Version: v1.9.0
  • Go Liftbridge Client Version: v2.3.0
  • Go Version: 1.22
  • Operating System: Pop!_OS 22.04 LTS x86_64
  • Docker Version: v26.0.1
  • Docker Compose Version: v2.26.1

Steps to Reproduce

  1. Dockerize a simple Go application that uses the go-liftbridge client.
  2. Use Docker Compose to manage the Liftbridge service and the Go application.
  3. Try to connect to the Liftbridge service using an IP address resolved from the service's hostname within Docker Compose.

The connection string is passed correctly, and even hardcoding the IP address directly in the client still results in attempts to connect to [::1].

Expected Behavior

The client should connect to the Liftbridge service using the provided IP address and port.

Actual Behavior

The client ignores the provided IP address and port and attempts to connect to [::1]:9292 (the IPv6 loopback address), resulting in a connection refused error.

DNS Resolution

I added a DNS resolution check before the connection attempt:

	// DNS Resolution Check
	addrs, err := net.LookupHost(cfg.liftbridgeIP)
	if err != nil {
		log.Fatalf("Failed to resolve host: %v", err)
	}
	log.Printf("Resolved liftbridge to: %v", addrs)

The hostname resolves successfully, as the logs below show.

Relevant Logs

➜  liftbridge-py-pg docker compose up
[+] Running 1/3
 ✔ Network liftbridge-py-pg_default                            Created                                                                                    0.1s
 ⠙ Container liftbridge-py-pg-nats-1                           Created                                                                                    0.1s
 ⠋ Container liftbridge-py-pg-liftbridge-1                     Created                                                                                    0.1s
 ⠋ Container liftbridge-py-pg-go-liftbridge-spark-connector-1  Created                                                                                    0.0s
Attaching to go-liftbridge-spark-connector-1, liftbridge-1, nats-1
nats-1                           | [1] 2024/05/13 08:37:47.590769 [INF] Starting nats-server
nats-1                           | [1] 2024/05/13 08:37:47.590859 [INF]   Version:  2.10.12
nats-1                           | [1] 2024/05/13 08:37:47.590860 [INF]   Git:      [121169ea]
nats-1                           | [1] 2024/05/13 08:37:47.590862 [INF]   Cluster:  my_cluster
nats-1                           | [1] 2024/05/13 08:37:47.590863 [INF]   Name:     NCCYRUELLOO7RICOGEE2HQ6I656KIW47K2M6IVAA6V46ME7Y6GKFMWKR
nats-1                           | [1] 2024/05/13 08:37:47.590868 [INF]   ID:       NCCYRUELLOO7RICOGEE2HQ6I656KIW47K2M6IVAA6V46ME7Y6GKFMWKR
nats-1                           | [1] 2024/05/13 08:37:47.590877 [INF] Using configuration file: nats-server.conf
nats-1                           | [1] 2024/05/13 08:37:47.591453 [INF] Starting http monitor on 0.0.0.0:8222
nats-1                           | [1] 2024/05/13 08:37:47.591504 [INF] Listening for client connections on 0.0.0.0:4222
nats-1                           | [1] 2024/05/13 08:37:47.591662 [INF] Server is ready
nats-1                           | [1] 2024/05/13 08:37:47.591680 [INF] Cluster name is my_cluster
nats-1                           | [1] 2024/05/13 08:37:47.591744 [INF] Listening for route connections on 0.0.0.0:6222
liftbridge-1                     | time="2024-05-13 08:37:47" level=info msg="Liftbridge Version:        v1.9.0"
liftbridge-1                     | time="2024-05-13 08:37:47" level=info msg="Server ID:                 8yiHlBt3t111bjg2XhgQSt"
liftbridge-1                     | time="2024-05-13 08:37:47" level=info msg="Namespace:                 liftbridge-default"
liftbridge-1                     | time="2024-05-13 08:37:47" level=info msg="NATS Servers:              [nats://nats:4222]"
liftbridge-1                     | time="2024-05-13 08:37:47" level=info msg="Default Retention Policy:  [Age: 1 week, Compact: false]"
liftbridge-1                     | time="2024-05-13 08:37:47" level=info msg="Default Partition Pausing: disabled"
liftbridge-1                     | time="2024-05-13 08:37:47" level=info msg="Starting Liftbridge server on 0.0.0.0:9292..."
liftbridge-1                     | time="2024-05-13 08:37:48" level=info msg="Server became metadata leader, performing leader promotion actions"
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:53 Resolved liftbridge to: [192.168.16.3]
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:53 Connecting to Liftbridge server at 192.168.16.3:9292
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:53 Attempt 1 to connect to Liftbridge at addresses: [192.168.16.3:9292]
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:53 Failed to connect to Liftbridge (1/5): rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:9292: connect: connection refused"
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:55 Attempt 2 to connect to Liftbridge at addresses: [192.168.16.3:9292]
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:55 Failed to connect to Liftbridge (2/5): rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:9292: connect: connection refused"
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:57 Attempt 3 to connect to Liftbridge at addresses: [192.168.16.3:9292]
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:57 Failed to connect to Liftbridge (3/5): rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:9292: connect: connection refused"
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:59 Attempt 4 to connect to Liftbridge at addresses: [192.168.16.3:9292]
go-liftbridge-spark-connector-1  | 2024/05/13 08:37:59 Failed to connect to Liftbridge (4/5): rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:9292: connect: connection refused"
go-liftbridge-spark-connector-1  | 2024/05/13 08:38:01 Attempt 5 to connect to Liftbridge at addresses: [192.168.16.3:9292]
go-liftbridge-spark-connector-1  | 2024/05/13 08:38:01 Failed to connect to Liftbridge (5/5): rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:9292: connect: connection refused"
go-liftbridge-spark-connector-1  | 2024/05/13 08:38:03 Failed to connect to Liftbridge server: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:9292: connect: connection refused"
go-liftbridge-spark-connector-1 exited with code 1

Additional Information

  • Networking setup in Docker Compose ensures all services are on the same network.
  • No relevant environment variables or network policies are influencing the behavior.
  • DNS resolution within the container confirms the correct IP address is resolved for the service name.

This issue seems to occur specifically within the Dockerized environment, and the same setup works when not running inside Docker. Any guidance or suggestions on what might be causing this behavior would be greatly appreciated.

umermasood, May 14 '24 07:05