docker-nginx-controller

Could not resolve host: controller hostname

Open vincentmli opened this issue 2 years ago • 8 comments

While building the image, I ran into:

#8 38.61 + curl -k -sS -L https://ngx-ctlr-master.localhost/install/controller-agent
#8 38.67 curl: (6) Could not resolve host: ngx-ctlr-master.localhost

ngx-ctlr-master.localhost is the NGINX Controller hostname. I don't have any DNS set up for the build host to resolve it. I tried adding ngx-ctlr-master.localhost to /etc/hosts on the build host, but that did not work. Then I added the docker build argument --add-host "ngx-ctlr-master.localhost:10.154.120.180", and this worked:

/usr/src/docker-nginx-controller/ubuntu/no-nap# DOCKER_BUILDKIT=1 docker build --build-arg CONTROLLER_URL=https://ngx-ctlr-master.localhost/install/controller-agent --build-arg API_KEY='495ad61b1beb6c0cfbd4e0eafd22ac9b' --add-host "ngx-ctlr-master.localhost:10.154.120.180" --secret id=nginx-crt,src=nginx-repo.crt --secret id=nginx-key,src=nginx-repo.key -t nginx-agent .
[+] Building 61.8s (10/10) FINISHED                                                                                   
 => [internal] load build definition from Dockerfile                                                             0.0s
 => => transferring dockerfile: 32B                                                                              0.0s
 => [internal] load .dockerignore                                                                                0.0s
 => => transferring context: 2B                                                                                  0.0s
 => [internal] load metadata for docker.io/library/ubuntu:18.04                                                  1.1s
 => [1/5] FROM docker.io/library/ubuntu:18.04@sha256:0fedbd5bd9fb72089c7bbca476949e10593cebed9b1fb9edf5b79dbbac  0.0s
 => [internal] load build context                                                                                0.0s
 => => transferring context: 74B                                                                                 0.0s
 => CACHED [2/5] COPY nginx-plus-api.conf /etc/nginx/conf.d/                                                     0.0s
 => CACHED [3/5] COPY entrypoint.sh /                                                                            0.0s
 => [4/5] RUN --mount=type=secret,id=nginx-crt,dst=/etc/ssl/nginx/nginx-repo.crt,mode=0644   --mount=type=secr  57.5s
 => [5/5] RUN ln -sf /proc/1/fd/1 /var/log/nginx-controller/agent.log   && ln -sf /proc/1/fd/2 /var/log/nginx/e  1.3s 
 => exporting to image                                                                                           1.8s 
 => => exporting layers                                                                                          1.8s 
 => => writing image sha256:49c21afb52599e9796b63690f572f8461f5d3c9610d966f349cd783d5793c050                     0.0s 
 => => naming to docker.io/library/nginx-agent                                                     

https://github.com/moby/moby/issues/34078 says the host:ip mapping will not persist in the image's /etc/hosts file. I wonder if this could be a problem.
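If the build-time mapping does not carry over into the image, one option is to pass the same mapping again when the container starts, since docker run also accepts --add-host. A minimal sketch, assuming the image tag nginx-agent and container name mynginx1 from this thread; the helper names host_mapping and run_agent are hypothetical, and any environment variables the entrypoint expects are left out:

```shell
#!/bin/sh
# Values taken from this thread; adjust for your environment.
CONTROLLER_HOST="ngx-ctlr-master.localhost"
CONTROLLER_IP="10.154.120.180"

# Print the mapping in the name:ip form that --add-host expects.
host_mapping() {
    printf '%s:%s\n' "$1" "$2"
}

# Start the agent container with the same mapping used at build time.
# Defined here but not invoked; call run_agent on a host with Docker.
run_agent() {
    docker run -d --name mynginx1 \
        --add-host "$(host_mapping "$CONTROLLER_HOST" "$CONTROLLER_IP")" \
        nginx-agent
}
```

With this, Docker writes the entry into the container's /etc/hosts at start, so nothing has to persist in the image itself.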

vincentmli avatar Nov 03 '21 21:11 vincentmli

There does seem to be some problem:

/usr/src/docker-nginx-controller/ubuntu/no-nap# docker ps

CONTAINER ID   IMAGE         COMMAND               CREATED          STATUS          PORTS     NAMES
7a3e37caaa3b   nginx-agent   "sh /entrypoint.sh"   12 minutes ago   Up 12 minutes   80/tcp    mynginx1

/usr/src/docker-nginx-controller/ubuntu/no-nap# docker logs 7a3e37caaa3b

starting nginx ...
waiting for nginx workers ...
2021/11/03 21:05:50 [notice] 8#8: using the "epoll" event method
2021/11/03 21:05:50 [notice] 8#8: nginx/1.21.3 (nginx-plus-r25)
2021/11/03 21:05:50 [notice] 8#8: built by gcc 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 
2021/11/03 21:05:50 [notice] 8#8: OS: Linux 5.12.0-051200-generic
2021/11/03 21:05:50 [notice] 8#8: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2021/11/03 21:05:50 [notice] 8#8: start worker processes
2021/11/03 21:05:50 [notice] 8#8: start worker process 11
2021/11/03 21:05:50 [notice] 8#8: start worker process 12
2021/11/03 21:05:50 [notice] 8#8: start worker process 13
2021/11/03 21:05:50 [notice] 8#8: start worker process 14
2021/11/03 21:05:50 [notice] 8#8: start worker process 15
2021/11/03 21:05:50 [notice] 8#8: start worker process 16
2021/11/03 21:05:50 [notice] 8#8: start worker process 17
2021/11/03 21:05:50 [notice] 8#8: start worker process 18
updating /etc/controller-agent/agent.conf ...
 ---> using api_key = 495ad61b1beb6c0cfbd4e0eafd22ac9b
 ---> using instance_name = 7a3e37caaa3b
starting controller-agent ...
time="Nov  3 2021 21:05:53.293" level="info" msg="Starting NGINX Controller (Go) Agent. Version: 3.20.5-354136471.release-3-20..." feature="main"
time="Nov  3 2021 21:05:53.296" level="info" msg="Number of NGINX instances discovered" count="1" feature="main"

I do not see the NGINX Plus instance added in NGINX Controller ngx-ctlr-master.localhost.

Tailing the agent log shows nothing:

docker exec 7a3e37caaa3b tail /var/log/nginx-controller/agent.log

vincentmli avatar Nov 03 '21 21:11 vincentmli

Do you have any orphaned instances in Controller? You should see a new instance "7a3e37caaa3b". Can the instance resolve and connect to the Controller? Given "Could not resolve host: ngx-ctlr-master.localhost", I would say not. You need to change the FQDN of your Controller to one that is resolvable and reachable by agents.
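The "resolve and connect" question can be checked from inside the running container. A minimal sketch, assuming getent and curl are present in the image (curl is used during the build, but neither is confirmed in the final image); the function names can_resolve and can_connect are hypothetical:

```shell
#!/bin/sh
# Return 0 if the name resolves through the system resolver
# (/etc/hosts or DNS, as configured in the container).
can_resolve() {
    getent hosts "$1" >/dev/null 2>&1
}

# Return 0 if an HTTPS connection to the host can be made within 5 seconds.
# -k skips certificate verification, matching the curl call in the build log.
can_connect() {
    curl -k -sS -o /dev/null --connect-timeout 5 "https://$1/"
}

# Example, run inside the container (for instance via docker exec):
#   can_resolve ngx-ctlr-master.localhost && echo "resolves"
#   can_connect ngx-ctlr-master.localhost && echo "reachable"
```

If can_resolve fails but can_connect works with the bare IP, the problem is name resolution only, not the network path.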

brianehlert avatar Nov 03 '21 21:11 brianehlert

Hm, after I deployed the image in k3s as a pod with hostNetwork: true (which I assume gives the pod access to the host machine's /etc/hosts, where I manually added the controller ip:name mapping) and it could use the host node IP 10.154.120.204 to connect to the controller at 10.154.120.180, the instance was discovered by the controller. But so far I am unable to see logs in the controller agent.log like we usually see in a non-container deployment.

# cat nginx-agent.yaml 
apiVersion: v1
kind: Pod
metadata:
 name: nginx-agent 
spec:
 hostNetwork: true
 containers:
   - name: nginx-agent 
     image: vli39/nginx-agent:test
     securityContext:
       capabilities:
         add:
           - NET_ADMIN
           - NET_RAW
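
Kubernetes also lets a pod carry its own /etc/hosts entries through spec.hostAliases, which may avoid the need for hostNetwork just to get name resolution. A sketch that renders such a manifest from the shell, not tested against this image; the function name render_pod is hypothetical, and the capabilities from the manifest above are omitted for brevity:

```shell
#!/bin/sh
# Print a Pod manifest that maps the controller hostname to its IP
# via hostAliases. $1 = controller IP, $2 = controller hostname.
render_pod() {
    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx-agent
spec:
  hostAliases:
    - ip: "$1"
      hostnames:
        - "$2"
  containers:
    - name: nginx-agent
      image: vli39/nginx-agent:test
EOF
}

# Example:
#   render_pod 10.154.120.180 ngx-ctlr-master.localhost | kubectl apply -f -
```

The kubelet then adds the entry to the pod's /etc/hosts, independent of what the node's own file contains.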

kubectl get po -o wide

NAME          READY   STATUS    RESTARTS   AGE    IP               NODE            NOMINATED NODE   READINESS GATES
nginx-agent   1/1     Running   0          4m4s   10.154.120.204   cilium-worker   <none>           <none>

cat /etc/hosts

127.0.0.1	localhost.localdomain	localhost
::1		localhost6.localdomain6	localhost6
127.0.0.1	cilium-worker
10.154.120.180	ngx-ctlr-master.localhost

kubectl exec -it nginx-agent -- head -20 /etc/nginx-controller/agent.conf

[credentials]
api_key = 495ad61b1beb6c0cfbd4e0eafd22ac9b
hostname = 
uuid = 
imagename = 
store_uuid = False
tags = 

[agent]
launchers = 

[proxies]
https = 

[cloud]
api_url = ngx-ctlr-master.localhost:443
api_timeout = 5.0
max_retention = 360
verify_ssl_cert = false
requests_ca_bundle = 

vincentmli avatar Nov 03 '21 22:11 vincentmli

Do you have any orphaned instances in Controller? You should see a new instance "7a3e37caaa3b". Can the instance resolve and connect to the Controller? Given "Could not resolve host: ngx-ctlr-master.localhost", I would say not. You need to change the FQDN of your Controller to one that is resolvable and reachable by agents.

The problem is that when building the image, the Ubuntu 18.04 container does not have access to the host machine's /etc/hosts. Adding the docker build argument --add-host resolves the build problem, but running the nginx agent container may require a proper DNS setup for the container to resolve the controller FQDN, which I do not have in a quick experimental lab setup.

vincentmli avatar Nov 03 '21 22:11 vincentmli

Use the controller IP address for this: api_url = ngx-ctlr-master.localhost:443. Provide the IP instead. Easy peasy. It can also be overridden at container deployment time using a docker env var. Don't focus literally on the string Controller provides; the agent needs to reach out to Controller in whatever way works for your environment. Controller listens for agents on all IPs of its machine.
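Swapping the hostname for the IP in agent.conf can be sketched as a small shell helper. This assumes the api_url line has the form shown in the agent.conf excerpt earlier in the thread; the function name set_api_url is hypothetical, and the entrypoint may rewrite the file at container start, so the edit might need to happen after that step:

```shell
#!/bin/sh
# Replace the api_url value in an agent.conf-style file.
# $1 = path to the config file, $2 = new host:port value.
# The new value must not contain "|" or "&", which sed would interpret.
set_api_url() {
    conf_file=$1
    new_target=$2
    tmp_file=$(mktemp) || return 1
    # Write the edited copy to a temp file, then copy the contents back
    # so the original file keeps its ownership and permissions.
    sed "s|^api_url *=.*|api_url = ${new_target}|" "$conf_file" > "$tmp_file" &&
        cat "$tmp_file" > "$conf_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}

# Example (path as used by head in the pod above):
#   set_api_url /etc/nginx-controller/agent.conf 10.154.120.180:443
```

Only the api_url line changes; all other settings in the file are left as they are.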

brianehlert avatar Nov 03 '21 23:11 brianehlert

Use the controller IP address for this: api_url = ngx-ctlr-master.localhost:443. Provide the IP instead. Easy peasy. It can also be overridden at container deployment time

I did use the IP address initially, but at a later docker build stage it complained about the controller hostname ngx-ctlr-master.localhost, which made me use the hostname.

vincentmli avatar Nov 03 '21 23:11 vincentmli

DOCKER_BUILDKIT=1 docker build --build-arg CONTROLLER_URL=https://10.154.120.180/install/controller-agent --build-arg API_KEY='495ad61b1beb6c0cfbd4e0eafd22ac9b' --secret id=nginx-crt,src=nginx-repo.crt --secret id=nginx-key,src=nginx-repo.key -t nginx-agent .


#9 72.25   6. Checking connectivity to the NGINX Controller ... failed to connect to the NGINX Controller
#9 72.32 
#9 72.32 curl: (6) Could not resolve host: ngx-ctlr-master.localhost
#9 72.32  Verify your network and firewall settings.
#9 72.32  exiting.

vincentmli avatar Nov 03 '21 23:11 vincentmli

The FQDN setting of the Controller platform drives this value. It should be set to the DNS name you want the agents to resolve when connecting to the Controller instance. It can be defined at Controller install time or afterwards. That is the purpose of this value; it is then baked into the agent installation process.

brianehlert avatar Nov 05 '21 14:11 brianehlert