Service connect not working
Why can't my Prometheus access my Pushgateway? They are both in the same ECS cluster.
2024-03-01T07:17:30.603Z warn internal/transaction.go:123 Failed to scrape Prometheus endpoint
{"kind": "receiver", "name": "prometheus", "data_type": "metrics", "scrape_timestamp": 1709277450602,
"target_labels": "{__name__=\"up\", instance=\"pushgateway:9091\", job=\"pushgatewy\"}"}
I did set the port to 9091 and enabled Service Connect:
name: pushgateway
type: Backend Service
image:
  location: prom/pushgateway
  port: 9091
exec: true # Enable running commands in your container.
network:
  connect: true
Service Connect is still working for me! Can you first run copilot svc show --name pushgateway to note its Service Connect endpoint (it should be pushgateway:9091), then run copilot svc exec --name prometheus to enter your Prometheus service container and try curl -L pushgateway:9091?
$ copilot svc show --name pushgateway
About
  Application  chatbot
  Name         pushgateway
  Type         Backend Service
Configurations
  Environment  Tasks  CPU (vCPU)  Memory (MiB)  Platform      Port
  -----------  -----  ----------  ------------  --------      ----
  stage        1      0.25        512           LINUX/X86_64  9091
Internal Service Endpoints
  Endpoint                              Environment  Type
  --------                              -----------  ----
  pushgateway:9091                      stage        Service Connect
  pushgateway.stage.chatbot.local:9091  stage        Service Discovery
I'm using adot-collector (https://aws-otel.github.io/docs/getting-started/prometheus-remote-write-exporter) to scrape Prometheus metrics, and its image doesn't support exec into the container. :(
$ copilot svc exec --name adot-collector --env stage
Execute `/bin/sh` in container adot-collector in task 152db713bf024c8c8834a82d5a4846b3.
Starting session with SessionId: ecs-execute-command-0be813afa7d7b6e84
SessionId: ecs-execute-command-0be813afa7d7b6e84 :
----------ERROR-------
Unable to start command: Failed to start pty: fork/exec /bin/sh: no such file or directory
However, I can push my app's metrics to pushgateway:9091, and that app also lives in the same ECS cluster.
After changing the scrape target to the IP address, it works:
  - job_name: pushgatewy
    honor_labels: true
    static_configs:
-     - targets: ['pushgateway:9091']
+     - targets: ['10.0.2.168:9091']
So I think the network path is reachable, but the pushgateway hostname is not being resolved.
Is it because adot-collector has client-side-only Service Connect rather than client and server?
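One way to separate the two failure modes discussed here (name resolution versus TCP reachability) is a small probe like the one below. This is only a sketch: `diagnose` is a hypothetical helper, not part of Copilot or ADOT, and it assumes it runs from a container in the same task network that has a Python interpreter (the ADOT collector image does not appear to ship a shell, so it would have to run from a sidecar or another service).

```python
import socket


def diagnose(host, port, timeout=3.0, resolve=socket.getaddrinfo):
    """Return 'dns_failure', 'connect_failure', or 'ok' for host:port.

    `resolve` is injectable so the resolution step can be stubbed in tests.
    """
    try:
        # Step 1: name resolution only; nothing is sent to the target yet.
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: try a TCP connection to each resolved address in turn.
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            # Refused, timed out, or unreachable: try the next address.
            continue
    return "connect_failure"
```

Calling `diagnose("pushgateway", 9091)` and `diagnose("10.0.2.168", 9091)` from the scraping side would show whether the name or the connection is the part that fails; a `dns_failure` for the name combined with `ok` for the IP matches the symptom described above.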
Thanks @Lou1415926. After I exposed a container port on adot-collector, it automatically changed to client and server, and now pushgateway:9091 is accessible. Not sure why the port is required. 😅
FROM public.ecr.aws/aws-observability/aws-otel-collector:latest
COPY --from=base /otel-config.yaml /etc/ecs/otel-config.yaml
+ EXPOSE 4317
  image:
    # Docker build arguments. For additional overrides: https://aws.github.io/copilot-cli/docs/manifest/backend-service/#image-build
    build: adot.Dockerfile
+   port: 4317 # add port to enable 'client and server' service connect
Oh yeah, "pushgateway" does need to be a Service Connect server (in addition to, or instead of, a client) to be able to receive Service Connect inbound traffic! The image.port is required because a Service Connect server needs to designate at least one port as the traffic-receiving port. Glad it got resolved!
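The rule as it played out in this thread can be summarized in a few lines. This is a sketch of the behavior reported above, not Copilot's actual implementation: `service_connect_mode` is a hypothetical helper, the manifest is represented as a plain dict, and treating a missing `network.connect` as disabled is an assumption.

```python
def service_connect_mode(manifest):
    """Classify a manifest dict using the rule described in this thread."""
    network = manifest.get("network") or {}
    image = manifest.get("image") or {}
    if not network.get("connect"):
        # Assumption: Service Connect is off unless network.connect is set.
        return "disabled"
    if image.get("port") is None:
        # No port to receive traffic on, so the service can only be a client.
        return "client only"
    return "client and server"


# The two adot-collector manifests from the thread, before and after the fix.
before = {"image": {"build": "adot.Dockerfile"}, "network": {"connect": True}}
after = {"image": {"build": "adot.Dockerfile", "port": 4317}, "network": {"connect": True}}
print(service_connect_mode(before), "->", service_connect_mode(after))
```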