logspout icon indicating copy to clipboard operation
logspout copied to clipboard

Log are not processed after some time

Open matgenois opened this issue 7 years ago • 30 comments

I've subscribed to logdna.com and to get my log to there interface, they made me install logspout with this command line : sudo docker run --name="logdna" --restart=always -d -v=/var/run/docker.sock:/var/run/docker.sock -e SYSLOG_STRUCTURED_DATA="logdna@48950 tag=\"docker\"" gliderlabs/logspout syslog+tls://syslog-a.logdna.com:37555 We use Docker Cloud and AWS to host our node.js RESTful API. Our app log using Bunyan and we are logging to stdout. After some time (few hours), log stop flowing to logdna. When I restart container logspout, everything is fine again. I've seen a closed issue about dockercloud/logrotate would be the cause. logspout docker image Id is 87895b8ba614. it seems like the latest version.

Any ideas? Tell me what you need so I can help you helping me :) Thank you!

Admin EDIT: this case includes docker rotating logs and seems related to #216

matgenois avatar Jun 06 '17 13:06 matgenois

What output do you see from docker logs logspout prior to restarting the container?

michaelshobbs avatar Jun 06 '17 15:06 michaelshobbs

Thanks for your reply! Here what I have :

{"log":"2017/06/06 03:59:09 syslog: write tcp 172.17.0.2:57198-\u003e40.114.47.198:37555: write: connection timed out\n","stream":"stderr","time":"2017-06-06T03:59:09.648844167Z"} {"log":"2017/06/06 03:59:09 syslog: retrying tcp up to 10 times\n","stream":"stderr","time":"2017-06-06T03:59:09.648873473Z"} {"log":"2017/06/06 03:59:30 syslog: retry failed\n","stream":"stderr","time":"2017-06-06T03:59:30.111101136Z"} {"log":"2017/06/06 03:59:30 syslog: reconnecting up to 10 times\n","stream":"stderr","time":"2017-06-06T03:59:30.111161867Z"} {"log":"2017/06/06 03:59:30 syslog: reconnect successful\n","stream":"stderr","time":"2017-06-06T03:59:30.407016144Z"}

matgenois avatar Jun 06 '17 19:06 matgenois

That looks like logspout reconnected to the syslog server. Have you tried sending logs after you see reconnect successful?

michaelshobbs avatar Jun 06 '17 23:06 michaelshobbs

Yes, logs continue to flow in other containers but are not routed by logspout to our syslog server.

matgenois avatar Jun 07 '17 12:06 matgenois

Currently logspout may drop some logs during the reconnect process. Just to confirm, containers are sending logs right after you see the reconnect successful message in the logspout output, correct?

michaelshobbs avatar Jun 07 '17 14:06 michaelshobbs

No. Containers are not sending logs right after the reconnect successful. I must restart logspout to get containers logs.

matgenois avatar Jun 07 '17 15:06 matgenois

Apologies, what I meant to say there was, are logs being generated from other containers immediately after (or within say 5 minutes of) you seeing the reconnect successful output?

michaelshobbs avatar Jun 07 '17 15:06 michaelshobbs

Yes logs are generated from other containers during and immediately after the reconnect successful output. Our others containers are logging a lot of stuff.

matgenois avatar Jun 07 '17 15:06 matgenois

I use papertrail with the new reconnect logic and it does seem to work. Perhaps logdna is doing something different?

michaelshobbs avatar Jun 07 '17 15:06 michaelshobbs

Ah I initially didn't relate #216 to this issue. It should be clarified that this is most likely due to docker's log rotation. Need to verify though.

michaelshobbs avatar Jun 07 '17 16:06 michaelshobbs

@matgenois can you provide your docker daemon options?

michaelshobbs avatar Jun 07 '17 16:06 michaelshobbs

We use the default options from docker cloud. Based on this page (https://success.docker.com/Cloud/Solve/How_do_you_set_custom_docker_daemon_options_in_the_Docker_Cloud_Agent%3F), here's my dockercloud-agent.conf : {"CertCommonName":"2dc92577-669b-4c50-80c7-f03ef4af2631.node.dockerapp.io","DockerHost":"tcp://0.0.0.0:2375","Host":"https://cloud.docker.com/","Token":"3ef633ceb03d4e379d1264256f0d6858","UUID":"2dc92577-669b-4c50-80c7-f03ef4af2631","DockerOpts":""}

matgenois avatar Jun 07 '17 16:06 matgenois

Unfortunately that doesn't show me how your daemon logging is setup. Can you run ps auxwww | grep dockerd | grep -v grep and provide the output?

michaelshobbs avatar Jun 07 '17 16:06 michaelshobbs

I was able to recreate this as outlined in #216 with --log-opt max-size=10k. Logspout just stops sending logs with no error msg.

michaelshobbs avatar Jun 07 '17 16:06 michaelshobbs

Sorry I don't have a good expertise on docker. Your command return nothing but what I can understand is since I run through docker cloud, dockerd is not running in the same way. Maybe this what you want?

root 879 0.0 0.2 521956 9956 ? S<sl May31 3:38 /usr/bin/dockercloud-agent root 1259 0.9 1.2 1098468 48040 ? S<l May31 93:34 /usr/bin/docker daemon -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375 --userland-proxy=false --tlscert /etc/dockercloud/agent/cert.pem --tlskey /etc/dockercloud/agent/key.pem --tlscacert /etc/dockercloud/agent/ca.pem --tlsverify

matgenois avatar Jun 07 '17 17:06 matgenois

Not familiar with that setup. No worries though. I was able to reproduce the issue with docker's log rotation. Confirmed with both logdna and papertrail so it's definitely a logspout issue.

michaelshobbs avatar Jun 07 '17 17:06 michaelshobbs

duplicate of #255

michaelshobbs avatar Jun 07 '17 17:06 michaelshobbs

Glad you could have reproduced it but I would have liked better that the error would be on my side ;)

If I can help more, I will be happy to!!

matgenois avatar Jun 08 '17 12:06 matgenois

I've just seen this in /var/log/dockercloud/docker.log :

time="2017-06-14T17:45:50.328534625Z" level=warning msg="container f4f5159fe5955527a6d9ebc0d75d43651cca3a2120d5a345b408b7350bb50a15 restart canceled" time="2017-06-14T18:05:47.492101354Z" level=warning msg="container f4f5159fe5955527a6d9ebc0d75d43651cca3a2120d5a345b408b7350bb50a15 restart canceled" time="2017-06-14T18:20:26.228212877Z" level=warning msg="container f4f5159fe5955527a6d9ebc0d75d43651cca3a2120d5a345b408b7350bb50a15 restart canceled"

f4f5159fe... is the container id of logspout

matgenois avatar Jun 14 '17 18:06 matgenois

@michaelshobbs Any idea on how I could detect that logspout is no longer sending log to syslog so I can make a script that will restart logspout container? I'm looking at logspout log and I have no output and in dockercloud.log either.

Do you think theres other logrotate images that would not cause this problem?

matgenois avatar Jun 16 '17 14:06 matgenois

I have found a way to restart logspout after logrotate : https://serverfault.com/questions/856251/launch-script-after-dockercloud-logrotate-rotate-log

matgenois avatar Jun 16 '17 18:06 matgenois

+1 I'm also experiencing this issue. I'm using the ograhl/logspout-restart container atm to restart the logspout container when it the logs are rotated. It isn't ideal, so a real fix for this bug would be welcome :)

joshontheweb avatar Jun 22 '17 21:06 joshontheweb

I think i have a similar problem, can't be sure if it is the same problem. Anyway, restarting works, i will see how to do it.

sfrique avatar Jul 05 '17 18:07 sfrique

Hi, having the same issue, and INACTIVITY_TIMEOUT=XX described in the readme https://github.com/gliderlabs/logspout#detecting-timeouts-in-docker-log-streams fixes everything.

I set it to 30 seconds : INACTIVITY_TIMEOUT=30s and even with really tiny log files, it seems to be ok :

Here is the logging description of services I want to log :

    logging:
      driver: json-file
      options:
        max-size: 1m
        max-file: 1

And the logspout-logstash service :

version: '2'
services:
  logspout:
    image: bekt/logspout-logstash:latest
    environment:
      LOGSPOUT: ignore
      ROUTE_URIS: logstash://logstash:5001
      INACTIVITY_TIMEOUT: 30s
    stdin_open: true
    external_links:
    - elk/logstash:logstash
    volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    tty: true
    logging:
      driver: json-file
      options:
        max-size: 50m
        max-file: 1

The bekt image contains a logstash module and is based on the latest logspout image.

digitalLumberjack avatar Jul 08 '17 09:07 digitalLumberjack

+1

Paveltarno avatar Aug 08 '17 09:08 Paveltarno

Would be cool if this was fixed without the INACTIVITY_TIMEOUT environment variable, any ideas on how we could achieve this?

mtricht avatar Aug 17 '17 15:08 mtricht

  1. Does the INACTIVITY_TIMEOUT fix guarantee no dropped logs, even for the period INACTIVITY_TIMEOUT is set to?
  2. ETA on a more robust fix?

alecbz avatar Sep 14 '17 00:09 alecbz

Seeing the same issue with INACTIVITY_TIMEOUT set to 30s

2018-04-23
22:49:08 # logspout v3.3-dev by gliderlabs
22:49:08 # adapters: raw syslog tcp udp tls
22:49:08 # options : persist:/mnt/routes
22:49:08 # jobs : http[routes,logs]:80 pump routes
22:49:08 # routes :
22:49:08 # ADAPTER	ADDRESS	CONTAINERS	SOURCES	OPTIONS
22:49:08 # syslog+tls	syslog-a.logdna.com:6514	map[]

2018-04-24
06:59:55 06:59:55 syslog: write tcp 172.17.0.2:43816->40.114.41.103:6514: write: broken pipe
06:59:55 06:59:55 syslog: reconnecting up to 10 times
06:59:55 syslog: reconnect successful
07:00:34 syslog: write tcp 172.17.0.2:53288->40.114.40.67:6514: write: broken pipe
07:00:34 syslog: reconnecting up to 10 times
07:00:34 syslog: reconnect successful

Despite this is saying reconnect successful the all process is never resuming. Restarting the all container works... Any kind of health-check would help as well.

kwent avatar Apr 25 '18 19:04 kwent

https://github.com/gliderlabs/logspout/issues/377#issuecomment-537708428

forestjohnsonpeoplenet avatar Oct 02 '19 22:10 forestjohnsonpeoplenet

@matgenois does INACTIVITY_TIMEOUT resolve your issue? Can this be closed?

Hossy avatar Oct 05 '22 00:10 Hossy