Service connect not working

Open Folyd opened this issue 2 years ago • 6 comments

Why can't my Prometheus scraper reach my pushgateway? They are both in the same ECS cluster.

2024-03-01T07:17:30.603Z warn internal/transaction.go:123 Failed to scrape Prometheus endpoint 
{"kind": "receiver", "name": "prometheus", "data_type": "metrics", "scrape_timestamp": 1709277450602, 
"target_labels": "{__name__=\"up\", instance=\"pushgateway:9091\", job=\"pushgatewy\"}"}

I did set port 9091 and enabled Service Connect:

name: pushgateway
type: Backend Service
image:
  location: prom/pushgateway
  port: 9091

exec: true # Enable running commands in your container.
network:
  connect: true 

Folyd avatar Mar 01 '24 07:03 Folyd

Service connect is still working for me! Can you first run copilot svc show --name pushgateway to note its service connect endpoint (should be pushgateway:9091), and copilot svc exec --name prometheus to enter your prometheus service container, and try curl -L pushgateway:9091?

Lou1415926 avatar Mar 01 '24 21:03 Lou1415926

$ copilot svc show --name pushgateway
About

  Application  chatbot
  Name         pushgateway
  Type         Backend Service

Configurations

  Environment  Tasks     CPU (vCPU)  Memory (MiB)  Platform      Port
  -----------  -----     ----------  ------------  --------      ----
  stage        1         0.25        512           LINUX/X86_64  9091

Internal Service Endpoints

  Endpoint                              Environment  Type
  --------                              -----------  ----
  pushgateway:9091                      stage        Service Connect
  pushgateway.stage.chatbot.local:9091  stage        Service Discovery

I'm using adot-collector (https://aws-otel.github.io/docs/getting-started/prometheus-remote-write-exporter) to scrape Prometheus metrics, but I can't exec into its container. :(

$ copilot svc exec --name adot-collector --env stage
Execute `/bin/sh` in container adot-collector in task 152db713bf024c8c8834a82d5a4846b3.

Starting session with SessionId: ecs-execute-command-0be813afa7d7b6e84


SessionId: ecs-execute-command-0be813afa7d7b6e84 :
----------ERROR-------
Unable to start command: Failed to start pty: fork/exec /bin/sh: no such file or directory

Folyd avatar Mar 01 '24 23:03 Folyd

However, I can push my app's metrics to pushgateway:9091, and that app also lives in the same ECS cluster.

Folyd avatar Mar 01 '24 23:03 Folyd

After changing the target to the IP address, it works:

         - job_name: pushgatewy
           honor_labels: true
           static_configs:
-            - targets: ['pushgateway:9091']
+            - targets: ['10.0.2.168:9091']

So I think the network is reachable, but the hostname pushgateway is not being resolved.

Is it because adot-collector has "Client side only" Service Connect rather than "Client and server"?
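
Hard-coding the task IP is fragile, since it can change whenever the pushgateway task is replaced. An untested alternative might be to point the scrape job at the Service Discovery endpoint listed in the copilot svc show output above, which is plain DNS rather than Service Connect. A sketch, assuming that name resolves from inside the adot-collector task:

```yaml
# Sketch only: same scrape job as above, but targeting the Service Discovery
# endpoint from `copilot svc show` instead of a hard-coded task IP.
# Assumption: this DNS name is resolvable from the adot-collector task.
- job_name: pushgatewy   # job name kept exactly as in the original config
  honor_labels: true
  static_configs:
    - targets: ['pushgateway.stage.chatbot.local:9091']
```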

Folyd avatar Mar 02 '24 00:03 Folyd

Thanks @Lou1415926. After I exposed the container port of adot-collector, it automatically changed to "Client and server", and pushgateway:9091 is now accessible. Not sure why the port is required. 😅

FROM public.ecr.aws/aws-observability/aws-otel-collector:latest
COPY --from=base /otel-config.yaml /etc/ecs/otel-config.yaml
+ EXPOSE 4317

image:
  # Docker build arguments. For additional overrides: https://aws.github.io/copilot-cli/docs/manifest/backend-service/#image-build
  build: adot.Dockerfile
+  port: 4317 # add port to enable 'client and server' service connect

Folyd avatar Mar 02 '24 01:03 Folyd

Oh yeah, "pushgateway" does need to be a Service Connect server (in addition to / instead of a client) to be able to receive Service Connect inbound traffic! The image.port field is required because a Service Connect server needs to designate at least one port as the traffic-receiving port. Glad it got resolved!
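
For anyone landing here later, the fix above would look roughly like this in the adot-collector manifest. This is a sketch pieced together from the snippets in this thread, not the actual file: the service type and the network section are assumptions.

```yaml
# Sketch of the adot-collector manifest after the fix.
name: adot-collector
type: Backend Service      # assumption: the service type is not shown in this thread

image:
  build: adot.Dockerfile
  port: 4317               # the traffic-receiving port; this is what switched the
                           # service to "Client and server" in this thread

network:
  connect: true            # assumption: Service Connect was already enabled here
```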

Lou1415926 avatar Mar 04 '24 19:03 Lou1415926