Ingress: Quickstart not working any more, ENDPOINT stays in pending state

Open stevel032 opened this issue 3 years ago • 7 comments

Hi guys, kudos to you for launching the new Acorn system. I tried out the quickstart (right on the docs home page: https://docs.acorn.io/) a few days ago and it worked really well. Today I tried it again and it doesn't work any more. Here are some of the details.

First of all, I'm running Docker Desktop on Mac with its built-in Kubernetes, and everything seems to be in order:

$ kubectl version --short
... ...

Client Version: v1.24.1
Kustomize Version: v4.5.4
Server Version: v1.24.1

The Acornfile I have is from the docs, I haven't made any changes:

$ acorn --version
acorn version v0.1.0+80d6e93d

$ ls -a
.		..		Acornfile

$ cat Acornfile
containers: {
 web: {
  image: "nginx"
  ports: publish: "80/http"
  files: {
   // Simple index.html file
   "/usr/share/nginx/html/index.html": "<h1>My First Acorn!</h1>"
  }
 }
}

Running the quickstart is very simple as well:

$ acorn run .
[+] Building 3.9s (5/5) FINISHED
 => [internal] load .dockerignore                                                                                                     0.0s
 => => transferring context: 2B                                                                                                       0.0s
 => [internal] load build definition from acorn-dockerfile-680879854                                                                  0.1s
 => => transferring dockerfile: 64B                                                                                                   0.1s
 => [internal] load metadata for docker.io/library/nginx:latest                                                                       3.7s
 => CACHED [1/1] FROM docker.io/library/nginx@sha256:ecc068890de55a75f1a32cc8063e79f90f0b043d70c5fcf28f1713395a4b3d49                 0.0s
 => => resolve docker.io/library/nginx@sha256:ecc068890de55a75f1a32cc8063e79f90f0b043d70c5fcf28f1713395a4b3d49                        0.0s
 => exporting to image                                                                                                                0.1s
 => => exporting layers                                                                                                               0.0s
 => => exporting manifest sha256:77c9bb357067366f2629112b4af9e008bd9ff1ebcaa97f99420b2dfbc6269155                                     0.0s
 => => exporting config sha256:f53af215ede0ee11e25dcfdd813625fa1564d3053721493a1cc5553317c87696                                       0.0s
 => => pushing layers                                                                                                                 0.0s
 => => pushing manifest for 127.0.0.1:5000/acorn/acorn:latest@sha256:77c9bb357067366f2629112b4af9e008bd9ff1ebcaa97f99420b2dfbc626915  0.0s
[+] Building 0.2s (5/5) FINISHED
 => [internal] load .dockerignore                                                                                                     0.0s
 => => transferring context: 64B                                                                                                      0.0s
 => [internal] load build definition from Dockerfile                                                                                  0.0s
 => => transferring dockerfile: 58B                                                                                                   0.0s
 => [internal] load build context                                                                                                     0.1s
 => => transferring context: 366B                                                                                                     0.1s
 => CACHED [1/1] COPY . /                                                                                                             0.0s
 => exporting to image                                                                                                                0.1s
 => => exporting layers                                                                                                               0.0s
 => => exporting manifest sha256:a1dd1fa387f2ed5e208780387bcdb719f0e2616e378c86be0df98564a865604e                                     0.0s
 => => exporting config sha256:67271949afe8d3df6153b3932b6eeb35eb0959d703a0e7ab146d6a48f706f1fc                                       0.0s
 => => pushing layers                                                                                                                 0.0s
 => => pushing manifest for 127.0.0.1:5000/acorn/acorn:latest@sha256:a1dd1fa387f2ed5e208780387bcdb719f0e2616e378c86be0df98564a865604  0.0s
wild-lake

When I then look at the app itself, the ENDPOINT stays in the pending state indefinitely. A few days ago the DNS name showed up there, and I could point my browser at the ENDPOINT URL and see the simple page. Now I can't do that any more.

$ acorn ps
NAME        IMAGE          HEALTHY   UP-TO-DATE   CREATED   ENDPOINTS                    MESSAGE
wild-lake   8506621381be   1         1            17m ago   http://<pending> => web:80   OK

I tried to pull the logs but can't see anything suspicious:

$ acorn logs wild-lake
web-5b766cb869-w9nvx: /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
web-5b766cb869-w9nvx: /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
web-5b766cb869-w9nvx: /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
web-5b766cb869-w9nvx: 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
web-5b766cb869-w9nvx: 10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
web-5b766cb869-w9nvx: /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
web-5b766cb869-w9nvx: /docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
web-5b766cb869-w9nvx: /docker-entrypoint.sh: Configuration complete; ready for start up
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: using the "epoll" event method
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: nginx/1.23.1
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: built by gcc 10.2.1 20210110 (Debian 10.2.1-6)
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: OS: Linux 5.10.104-linuxkit
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: getrlimit(RLIMIT_NOFILE): 1048576:1048576
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: start worker processes
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: start worker process 37
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: start worker process 38
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: start worker process 39
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: start worker process 40
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: start worker process 41
web-5b766cb869-w9nvx: 2022/08/08 05:12:01 [notice] 7#7: start worker process 42

Could you suggest how else I might debug this issue? Thank you all...

stevel032 avatar Aug 08 '22 05:08 stevel032

Can you check whether an ingress controller got installed in your cluster? On Docker Desktop, an ingress controller will be installed, then the ingress gets created, and then the endpoints get populated.

saiyam1814 avatar Aug 08 '22 05:08 saiyam1814

I'm using Docker Desktop for Mac as is and did not make any modifications. I'm not too familiar with the intricacies of ingress controllers, but I see there are bundled nginx ingress controller images in the Acorn image set, and they are running OK. Plus I had it working before, so I don't think my K8S setup is the issue.

However, just to rule that out for sure, I restarted Docker Desktop, and I was still seeing the ENDPOINT in the pending state.

stevel032 avatar Aug 08 '22 06:08 stevel032

Half an hour later, I looked at the running Docker containers and saw two dead ones. A bit later, I saw that they had been restarted/respawned:

$ docker ps -a
CONTAINER ID   IMAGE                  COMMAND                  CREATED          STATUS                      PORTS     NAMES
399c6df166dd   f53af215ede0           "/docker-entrypoint.…"   10 minutes ago   Up 10 minutes                         k8s_web_web-5b9ffb8fcc-mksj5_billowing-forest-9c081f51-7e9_7cdb1503-aa74-4987-b7fd-cad7d8ceadbc_0
b46296745e85   k8s.gcr.io/pause:3.7   "/pause"                 10 minutes ago   Up 10 minutes                         k8s_POD_web-5b9ffb8fcc-mksj5_billowing-forest-9c081f51-7e9_7cdb1503-aa74-4987-b7fd-cad7d8ceadbc_0
94759e56269b   8c2c38aa676e           "/kube-vpnkit-forwar…"   15 minutes ago   Up 15 minutes                         k8s_vpnkit-controller_vpnkit-controller_kube-system_1b81477c-8f26-4ae8-9d84-6b21159912ba_148
f00702937da1   bf3a2d365684           "/usr/local/bin/acor…"   58 minutes ago   Up 58 minutes                         k8s_acorn-controller_acorn-controller-755dc697cc-8xhkx_acorn-system_3fba524c-6ab9-48e0-a926-19d247e2b9cb_2

... ...

Notice the first 3 have been running for 10-15 minutes, and the others have been running for 58 minutes. Then I looked at the Acorn app again, and I was able to get the ENDPOINT:

$ acorn ps
NAME               IMAGE          HEALTHY   UP-TO-DATE   CREATED   ENDPOINTS                                                 MESSAGE
billowing-forest   aac454dc07a4   1         1            10m ago   http://web.billowing-forest.local.on-acorn.io => web:80   OK

So strictly speaking, this problem is no longer there, at least for now. But ideally I'd like to have some pointers on troubleshooting such problems if they appear again. Or perhaps the Acorn team can consider building such diagnostics into your tool set.
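As a rough illustration of the kind of diagnostic meant here, the following is a minimal sketch that scans the text printed by `acorn ps` and lists apps whose endpoint is still `<pending>`. It is not part of Acorn: the function name `pending_apps` is hypothetical, the sample table is stitched together from the two `acorn ps` outputs in this thread, and it assumes columns are separated by two or more spaces and that no column is empty.

```python
import re
from typing import List


def pending_apps(ps_output: str) -> List[str]:
    """Return names of apps whose ENDPOINTS column contains '<pending>'.

    Assumes the table layout shown in this thread: a header row, then one
    row per app, with columns separated by at least two spaces.
    """
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    if not lines:
        return []
    header = re.split(r"\s{2,}", lines[0].strip())
    name_idx = header.index("NAME")
    endpoint_idx = header.index("ENDPOINTS")
    pending = []
    for ln in lines[1:]:
        cols = re.split(r"\s{2,}", ln.strip())
        if len(cols) > endpoint_idx and "<pending>" in cols[endpoint_idx]:
            pending.append(cols[name_idx])
    return pending


# Sample input combined from the two `acorn ps` outputs quoted in this issue.
SAMPLE = (
    "NAME               IMAGE          HEALTHY   UP-TO-DATE   CREATED   ENDPOINTS                                                 MESSAGE\n"
    "wild-lake          8506621381be   1         1            17m ago   http://<pending> => web:80                                OK\n"
    "billowing-forest   aac454dc07a4   1         1            10m ago   http://web.billowing-forest.local.on-acorn.io => web:80   OK\n"
)

print(pending_apps(SAMPLE))  # prints ['wild-lake']
```

A check like this only reports the symptom. It says nothing about why the endpoint is pending, so it is best paired with a look at the ingress controller and the acorn-controller container.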

stevel032 avatar Aug 08 '22 07:08 stevel032

Improving this is at the top of our priority list. I might send you some questions later to help assess exactly what might have gone wrong

cjellick avatar Aug 08 '22 15:08 cjellick

This is an odd one. I haven't observed the endpoint coming and going like that. It's like the ingress is being flaky or something.

Do you continue to see this?

@ibuildthecloud any thoughts on debug steps?

cjellick avatar Aug 09 '22 15:08 cjellick

I am not seeing this any more, and I was able to move on to the next steps in the "Getting Started" section, bring up the Python app, and see the counter value going up. It was pretty smooth, good job guys...

I forgot to attach one more piece of information from yesterday. When Docker Desktop reported that one of the containers was dead, I was able to see a single line of log output:

Error: Get "https://10.96.0.1:443/api?timeout=32s": dial tcp 10.96.0.1:443: i/o timeout

It looks like some internal request to a 10.x host failed. The name of the container was this:

k8s_acorn-controller_acorn-controller-755dc697cc-8xhkx_acorn-system_3fba524c-6ab9-48e0-a926-19d247e2b9cb_1
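The `i/o timeout` in that log line means a TCP connection to 10.96.0.1:443 could not be established in time. In many Kubernetes setups that address is the in-cluster IP of the API server's `kubernetes` Service, but that is an assumption about this particular cluster. A minimal sketch of a reachability probe, using only the Python standard library (the helper name `can_connect` is hypothetical):

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Connection refusals, unreachable networks, and timeouts all surface as
    OSError (socket.timeout is a subclass), so they are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("10.96.0.1", 443)` would only be meaningful from inside the cluster network (for example, from a pod), since a cluster-internal IP is generally not reachable from the Mac host. The probe checks TCP reachability only, not TLS or API health.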

In any case, I'll report back if the same problem occurs again.

stevel032 avatar Aug 09 '22 18:08 stevel032

Thanks for the update

cjellick avatar Aug 10 '22 15:08 cjellick

I've been using Acorn off and on for the past few days and the issue has not popped up again. I don't know how you guys manage the issues here, but feel free to close it. If the problem occurs again, I'll let you know.

stevel032 avatar Aug 18 '22 18:08 stevel032

Will do. Thanks

cjellick avatar Aug 18 '22 19:08 cjellick