
skaffold dev does not show live tail logs after running for a while

Open jonog94 opened this issue 4 years ago • 15 comments

Expected behavior

running skaffold dev should continue to live tail logs even after running for a while

Actual behavior

After multiple image rebuilds and changes to source code, I don't get live logs when the server receives HTTP requests.

Information

  • Skaffold version: 0b8bd59
  • Operating system: macOS Catalina - version 10.15.5
  • Contents of skaffold.yaml:
apiVersion: skaffold/v2alpha1
kind: Config
build:
  local:
    push: false
  artifacts:
    - image: jon/kludge-client
      context: client
      sync:
        manual:
          - src: 'src/**/*.tsx'
            dest: '/usr/src/app'
          - src: 'src/**/**/*.tsx'
            dest: '/usr/src/app'
          - src: 'src/**/**/*.ts'
            dest: '/usr/src/app'
          - src: 'src/**/**/*.js'
            dest: '/usr/src/app'
          - src: 'src/**/**/*.scss'
            dest: '/usr/src/app'
          - src: '*.js'
            dest: '/usr/src/app'
      docker:
        dockerfile: Dockerfile
        buildArgs: 
          network: host
    - image: jon/kludge-api
      context: api
      sync:
        manual:
          - src: 'resources/views/*.php'
            dest: '/var/www/api'
          - src: 'app/Models/*.php'
            dest: '/var/www/api'
          - src: 'app/Http/Controllers/*.php'
            dest: '/var/www/api'
          - src: 'app/Http/Controllers/**/*.php'
            dest: '/var/www/api'
          - src: 'app/Abstracts/Repository/*.php'
            dest: '/var/www/api'
          - src: 'app/Repository/**/*.php'
            dest: '/var/www/api'
          - src: 'app/Services/*.php'
            dest: '/var/www/api'
          - src: 'app/Rules/*.php'
            dest: '/var/www/api'
          - src: 'app/Http/Middleware/*.php'
            dest: '/var/www/api'
          - src: 'app/Http/Requests/**/*.php'
            dest: '/var/www/api'
          - src: 'routes/*.php'
            dest: '/var/www/api'
          - src: 'database/**/*.php'
            dest: '/var/www/api'
          - src: 'config/*.php'
            dest: '/var/www/api'
          - src: 'app/Compiler/Languages/**/*'
            dest: '/var/www/api'
          - src: 'app/Examples/*.php'
            dest: '/var/www/api'
          - src: 'app/Examples/**/**/*'
            dest: '/var/www/api'
          - src: 'app/Providers/*.php'
            dest: '/var/www/api'
      docker:
        dockerfile: Dockerfile
deploy:
  kubectl:
    manifests:
      - k8-dev/client/deployment-manifest.yaml
      - k8-dev/api/deployment-manifest.yaml
      - k8-dev/client/service-manifest.yaml
      - k8-dev/api/service-manifest.yaml

Here's the Kubernetes Deployment for the pod that does not live tail:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: kludge
spec:
  replicas: 1
  selector:
    matchLabels:
      name: api
  template:
    metadata:
      labels:
        name: api
      namespace: kludge
    spec:
      containers:
        - image: jon/kludge-api
          name: api
          ports:
            - containerPort: 80
          imagePullPolicy: Never
          env:
            - name: APP_NAME
              value: "api"
            - name: APP_ENV
              value: "local"
            - name: APP_DEBUG
              value: "true"
            - name: APP_LOG_LEVEL
              value: "debug"
            - name: APP_URL
              value: "http://kludge.info"
            - name: LOG_CHANNEL
              value: "stack"
            - name: DB_CONNECTION
              value: "mysql"
            - name: DB_HOST
              value: "mysql"
            - name: DB_DATABASE
              value: "kludgedb"
            - name: DB_USERNAME
              value: "kludgedbuser"
            - name: DB_PASSWORD
              value: "secret"

Steps to reproduce the behavior

Run skaffold dev. Usually after 2 days or so, live tail logs stop working, even though the cluster is still running and everything else works.

jonog94 avatar Aug 05 '20 13:08 jonog94

Is there a specific amount of time until log tailing stops working? Like a timeout that gets set? It seems to be 10 hours or so for me until my pods stop redirecting stdout to the terminal.

jonog94 avatar Aug 06 '20 15:08 jonog94

Thanks for reporting this. I did a quick scan of our code and didn't see anything that looks like us setting timeouts or deadlines on contexts for log tailing. Can you rerun skaffold dev -v DEBUG --tail and see if there's anything obvious in the logs? If you can, share them with us, or share a small project that reproduces this behavior.

gsquared94 avatar Aug 07 '20 07:08 gsquared94

The pod watcher timeout is set to 100 years (watcher.go):

var forever int64 = 3600 * 24 * 365 * 100

gsquared94 avatar Aug 07 '20 07:08 gsquared94

Thanks @gsquared94 I will run it with that command. I realised that when skaffold rebuilds the image, log tailing starts working again. It just seems that when it's the same running pod (for over 10 hours) the log streaming does not show. It's hard to determine if the issue is my OS, Minikube or skaffold but I thought I'd start here.

jonog94 avatar Aug 07 '20 13:08 jonog94

Running in debug mode shows me that the underlying command is kubectl --context minikube logs --since=3s -f api-66fdc779d4-b992n -c api --namespace kludge when it sets up the log streaming.

When I ran this command on its own, it ran for 10 hours before disconnecting (no message was printed after the disconnect). I still don't know what causes the underlying issue. Based on my research it could be one of the following:

  1. Apparently log rotation disconnects log streaming (https://github.com/kubernetes/kubernetes/issues/28369). I'm not sure, though, whether that happens by default in a minikube environment.

  2. There could still be something skaffold does that interrupts log streaming, since I had skaffold running in the background. Maybe an unrelated rebuild of a different pod interrupts the log stream.

  3. I have an nginx ingress controller running, but based on the very different disconnect times I don't think it's a timeout issue coming from that end.

  4. It could just be my operating system: I'm always using 14 of 16 GB of RAM, so maybe it disconnects long-running processes? Although I don't know enough about this to conclude anything.
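Since the underlying kubectl logs -f command eventually disconnects regardless of the cause, the simplest client-side mitigation is to restart it whenever it exits. A minimal sketch, assuming the pod and namespace names from this thread; the max parameter is not part of any kubectl feature, it's only there so the loop can be bounded instead of running forever:

```shell
# Re-attach `kubectl logs -f` whenever the stream drops.
tail_logs() {
  pod="$1"; ns="$2"; max="${3:-0}"; attempts=0
  while :; do
    # --since limits how many old lines are replayed on each reconnect
    kubectl logs -f --since=5s "$pod" --namespace "$ns" || true
    attempts=$((attempts + 1))
    # max=0 means "reconnect forever"; a positive max bounds the loop
    if [ "$max" -gt 0 ] && [ "$attempts" -ge "$max" ]; then
      break
    fi
    echo "log stream dropped; reconnecting..." >&2
    sleep 2
  done
}

# usage: tail_logs api-66fdc779d4-b992n kludge
```

This replays up to 5 seconds of old output on each reconnect, which trades a few possibly duplicated lines for not losing anything during the gap.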

jonog94 avatar Aug 08 '20 22:08 jonog94

@gsquared94 I've posted an issue to kubernetes here https://github.com/kubernetes/kubernetes/issues/93819

Log streams get closed off eventually, should skaffold have a way of reconnecting dropped connections?

jonog94 avatar Aug 10 '20 03:08 jonog94

we'd want to be careful to distinguish between an initial failure when first attaching the logger vs. a timeout later on, but I think it would be relatively straightforward to have an "error watcher" that lives inside the pod watcher and kicks off a retry loop whenever it sees recurring failures from kubectl.

probably not something our team will prioritize in the near term, but contributions would be welcome!
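The shape of that retry loop might look like the following sketch (the function and variable names are illustrative, not skaffold's actual API): a stream that dies within a grace period is treated as an attach failure and retried with backoff, while a stream that ran for a long time before dropping is re-attached immediately.

```shell
# Sketch of the "error watcher" idea: re-attach the log stream on exit,
# distinguishing an attach failure (dies within GRACE seconds) from a
# later timeout/rotation (ran for a while before dropping).
GRACE=5
watch_stream() {
  stream_cmd="$1"; max="${2:-0}"; tries=0; backoff=1
  while :; do
    start=$(date +%s)
    $stream_cmd || true                # blocks while the stream is live
    elapsed=$(( $(date +%s) - start ))
    tries=$((tries + 1))
    if [ "$max" -gt 0 ] && [ "$tries" -ge "$max" ]; then
      break
    fi
    if [ "$elapsed" -lt "$GRACE" ]; then
      sleep "$backoff"                 # likely attach failure: back off
      backoff=$((backoff * 2))
    else
      backoff=1                        # long-lived run: reconnect at once
    fi
  done
}

# usage: watch_stream "kubectl logs -f api-66fdc779d4-b992n -n kludge"
```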

nkubala avatar Aug 10 '20 23:08 nkubala

I want to point out that I came here for this reason but found out it was due to the kubens namespace having been changed. Hopefully someone else finds this useful.

After an entire day of working on a project, towards the end of the night I noticed that I was no longer getting tail logs from dev, and it was getting really difficult to work with. It got to the point where I didn't believe skaffold was working at all (even though I just had bugs). At 12:30 AM I realized that earlier in the night I had switched to a different namespace with kubens to look at some pods and never switched back; after switching back to the namespace skaffold was working in, the tailing logs came back.

cphillips83 avatar Apr 07 '21 04:04 cphillips83

I am seeing log tailing fail sometimes because it looks like skaffold is not streaming logs from the correct pod. I used the suggestion above to start skaffold with verbose logging (i.e. skaffold dev -v DEBUG --tail).

When I modify a source file, skaffold rebuilds the image, kills the old pod and starts a new one. In the detailed logs I see lines like this: INFO[0120] Streaming logs from pod: app-backend-5b44dd6db6-rl4tb container: app-backend subtask=-1 task=DevLoop

The issue is that sometimes this does not match the pod that is actually running.

$ kubectl get pods
NAME                             READY   STATUS    RESTARTS   AGE
app-backend-86c77cfcd8-lsrxh   1/1     Running   0          9m2s

When the pod names match, tailing works. When they don't match, it doesn't. It seems to be a race condition between the pod starting and skaffold setting up the log streaming. I'm not sure if this is the same issue identified by the OP or not.
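If the stale-pod race is the cause, one way to confirm it (or work around it manually) is to resolve the currently running pod by its label selector at attach time, rather than by a cached pod name. A sketch, using the name=api label and kludge namespace from the Deployment earlier in this thread:

```shell
# Look up a currently Running pod by label selector instead of relying
# on a possibly stale cached pod name.
current_pod_for() {
  kubectl get pods -l "$1" -n "$2" \
    --field-selector=status.phase=Running \
    -o 'jsonpath={.items[0].metadata.name}'
}

# usage:
#   pod=$(current_pod_for name=api kludge)
#   kubectl logs -f "$pod" -n kludge
```

Comparing this lookup's output against the pod name in skaffold's "Streaming logs from pod:" line would show directly whether skaffold attached to an outdated replica.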

My config: skaffold v1.31.0, minikube v1.23.1, hyperkit v0.20210107-9-gacbc2d, kubectl v1.22.1, macOS Big Sur (x86)

ezrabowman avatar Sep 22 '21 19:09 ezrabowman

I am seeing the same issue. Rebuilding the containers restores log output for me, but obviously that isn't ideal.

Yamaha32088 avatar Oct 01 '21 13:10 Yamaha32088

I am in the same boat. Skaffold simply stops logging to console after some time.

FYI, I am not even touching any files. I have a nodejs application that is importing lots of data and logs the current import status to the console every 10 seconds. After two hours or so, logging stops while the application is still doing its work, as I can see from the database activity. So the fact that Skaffold stops logging has nothing to do with file syncing, imho. Rebuilding the application makes logging work again, for a while at least.

greets -act

PS: I use Docker Desktop on Mac with Kubernetes support enabled.

actraiser avatar Nov 03 '21 11:11 actraiser

Logs are printed only on the first deploy. After the container has been rebuilt, no logs are printed. If I manually delete the pod, logs are once again printed to stdout, but only until the container is rebuilt.

I've tested this with version 1.35.0 and 1.34.0.

tonsV2 avatar Nov 20 '21 16:11 tonsV2

Same as @tonsV2 -- I'm experiencing this issue in my application: logs stream correctly after the first invocation of skaffold dev, but do not stream after I make a change and Skaffold rebuilds and redeploys.

Skaffold 1.35.1

SkylerLutz avatar Jan 18 '22 22:01 SkylerLutz

Also seeing this issue - skaffold v1.37.0 (and prior) - logs eventually stop outputting after several hours.

Seen this on Rancher Desktop on Windows and OSX, Docker Desktop on Windows and OSX.

erulabs avatar Apr 05 '22 17:04 erulabs

I'm seeing the same issue that @tonsV2 and @SkylerLutz are describing. I'm on v1.39.1 installed from brew on macOS 11.6.8. The first deploy shows logs properly; once the dev cycle is triggered, logs stop flowing.

jbresciani avatar Aug 10 '22 23:08 jbresciani

This seems related to k8s log rotation https://github.com/kubernetes/kubernetes/issues/28369. However, that issue is not fixed, and skaffold just uses kubectl logs -f for streaming logs, so we might not be able to do anything about this.

ericzzzzzzz avatar Dec 12 '22 19:12 ericzzzzzzz

Closing this out, as there is nothing we can do about it.

ericzzzzzzz avatar Jan 09 '23 19:01 ericzzzzzzz

Someone in the issue thread there mentioned a workaround: https://github.com/kubernetes/kubernetes/issues/28369#issuecomment-1317380621

I haven't looked at it closely, but perhaps we could do something similar (re-opening for now given this).

aaron-prindle avatar Jan 10 '23 21:01 aaron-prindle