logspout
logspout copied to clipboard
Log are not processed after some time
I've subscribed to logdna.com and to get my log to there interface, they made me install logspout with this command line :
sudo docker run --name="logdna" --restart=always -d -v=/var/run/docker.sock:/var/run/docker.sock -e SYSLOG_STRUCTURED_DATA="logdna@48950 tag=\"docker\"" gliderlabs/logspout syslog+tls://syslog-a.logdna.com:37555
We use Docker Cloud and AWS to host our node.js RESTful API. Our app log using Bunyan and we are logging to stdout.
After some time (few hours), log stop flowing to logdna. When I restart container logspout, everything is fine again. I've seen a closed issue about dockercloud/logrotate would be the cause.
logspout docker image Id is 87895b8ba614. it seems like the latest version.
Any ideas? Tell me what you need so I can help you helping me :) Thank you!
Admin EDIT: this case includes docker rotating logs and seems related to #216
What output do you see from docker logs logspout
prior to restarting the container?
Thanks for your reply! Here what I have :
{"log":"2017/06/06 03:59:09 syslog: write tcp 172.17.0.2:57198-\u003e40.114.47.198:37555: write: connection timed out\n","stream":"stderr","time":"2017-06-06T03:59:09.648844167Z"} {"log":"2017/06/06 03:59:09 syslog: retrying tcp up to 10 times\n","stream":"stderr","time":"2017-06-06T03:59:09.648873473Z"} {"log":"2017/06/06 03:59:30 syslog: retry failed\n","stream":"stderr","time":"2017-06-06T03:59:30.111101136Z"} {"log":"2017/06/06 03:59:30 syslog: reconnecting up to 10 times\n","stream":"stderr","time":"2017-06-06T03:59:30.111161867Z"} {"log":"2017/06/06 03:59:30 syslog: reconnect successful\n","stream":"stderr","time":"2017-06-06T03:59:30.407016144Z"}
That looks like logspout reconnected to the syslog server. Have you tried sending logs after you see reconnect successful
?
Yes, logs continue to flow in other containers but are not routed by logspout to our syslog server.
Currently logspout may drop some logs during the reconnect process. Just to confirm, containers are sending logs right after you see the reconnect successful
message in the logspout output, correct?
No. Containers are not sending logs right after the reconnect successful. I must restart logspout to get containers logs.
Apologies, what I meant to say there was, are logs being generated from other containers immediately after (or within say 5 minutes of) you seeing the reconnect successful
output?
Yes logs are generated from other containers during and immediately after the reconnect successful output. Our others containers are logging a lot of stuff.
I use papertrail with the new reconnect logic and it does seem to work. Perhaps logdna is doing something different?
Ah I initially didn't relate #216 to this issue. It should be clarified that this is most likely due to docker's log rotation. Need to verify though.
@matgenois can you provide your docker daemon options?
We use the default options from docker cloud. Based on this page (https://success.docker.com/Cloud/Solve/How_do_you_set_custom_docker_daemon_options_in_the_Docker_Cloud_Agent%3F), here's my dockercloud-agent.conf :
{"CertCommonName":"2dc92577-669b-4c50-80c7-f03ef4af2631.node.dockerapp.io","DockerHost":"tcp://0.0.0.0:2375","Host":"https://cloud.docker.com/","Token":"3ef633ceb03d4e379d1264256f0d6858","UUID":"2dc92577-669b-4c50-80c7-f03ef4af2631","DockerOpts":""}
Unfortunately that doesn't show me how your daemon logging is setup. Can you run ps auxwww | grep dockerd | grep -v grep
and provide the output?
I was able to recreate this as outlined in #216 with --log-opt max-size=10k
. Logspout just stops sending logs with no error msg.
Sorry I don't have a good expertise on docker. Your command return nothing but what I can understand is since I run through docker cloud, dockerd is not running in the same way. Maybe this what you want?
root 879 0.0 0.2 521956 9956 ? S<sl May31 3:38 /usr/bin/dockercloud-agent root 1259 0.9 1.2 1098468 48040 ? S<l May31 93:34 /usr/bin/docker daemon -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375 --userland-proxy=false --tlscert /etc/dockercloud/agent/cert.pem --tlskey /etc/dockercloud/agent/key.pem --tlscacert /etc/dockercloud/agent/ca.pem --tlsverify
Not familiar with that setup. No worries though. I was able to reproduce the issue with docker's log rotation. Confirmed with both logdna and papertrail so it's definitely a logspout issue.
duplicate of #255
Glad you could have reproduced it but I would have liked better that the error would be on my side ;)
If I can help more, I will be happy to!!
I've just seen this in /var/log/dockercloud/docker.log :
time="2017-06-14T17:45:50.328534625Z" level=warning msg="container f4f5159fe5955527a6d9ebc0d75d43651cca3a2120d5a345b408b7350bb50a15 restart canceled" time="2017-06-14T18:05:47.492101354Z" level=warning msg="container f4f5159fe5955527a6d9ebc0d75d43651cca3a2120d5a345b408b7350bb50a15 restart canceled" time="2017-06-14T18:20:26.228212877Z" level=warning msg="container f4f5159fe5955527a6d9ebc0d75d43651cca3a2120d5a345b408b7350bb50a15 restart canceled"
f4f5159fe... is the container id of logspout
@michaelshobbs Any idea on how I could detect that logspout is no longer sending log to syslog so I can make a script that will restart logspout container? I'm looking at logspout log and I have no output and in dockercloud.log either.
Do you think theres other logrotate images that would not cause this problem?
I have found a way to restart logspout after logrotate : https://serverfault.com/questions/856251/launch-script-after-dockercloud-logrotate-rotate-log
+1 I'm also experiencing this issue. I'm using the ograhl/logspout-restart container atm to restart the logspout container when it the logs are rotated. It isn't ideal, so a real fix for this bug would be welcome :)
I think i have a similar problem, can't be sure if it is the same problem. Anyway, restarting works, i will see how to do it.
Hi, having the same issue, and INACTIVITY_TIMEOUT=XX described in the readme https://github.com/gliderlabs/logspout#detecting-timeouts-in-docker-log-streams fixes everything.
I set it to 30 seconds : INACTIVITY_TIMEOUT=30s
and even with really tiny log files, it seems to be ok :
Here is the logging description of services I want to log :
logging:
driver: json-file
options:
max-size: 1m
max-file: 1
And the logspout-logstash service :
version: '2'
services:
logspout:
image: bekt/logspout-logstash:latest
environment:
LOGSPOUT: ignore
ROUTE_URIS: logstash://logstash:5001
INACTIVITY_TIMEOUT: 30s
stdin_open: true
external_links:
- elk/logstash:logstash
volumes:
- /var/run/docker.sock:/var/run/docker.sock
tty: true
logging:
driver: json-file
options:
max-size: 50m
max-file: 1
The bekt image contains a logstash module and is based on the latest logspout image.
+1
Would be cool if this was fixed without the INACTIVITY_TIMEOUT
environment variable, any ideas on how we could achieve this?
- Does the
INACTIVITY_TIMEOUT
fix guarantee no dropped logs, even for the periodINACTIVITY_TIMEOUT
is set to? - ETA on a more robust fix?
Seeing the same issue with INACTIVITY_TIMEOUT
set to 30s
2018-04-23
22:49:08 # logspout v3.3-dev by gliderlabs
22:49:08 # adapters: raw syslog tcp udp tls
22:49:08 # options : persist:/mnt/routes
22:49:08 # jobs : http[routes,logs]:80 pump routes
22:49:08 # routes :
22:49:08 # ADAPTER ADDRESS CONTAINERS SOURCES OPTIONS
22:49:08 # syslog+tls syslog-a.logdna.com:6514 map[]
2018-04-24
06:59:55 06:59:55 syslog: write tcp 172.17.0.2:43816->40.114.41.103:6514: write: broken pipe
06:59:55 06:59:55 syslog: reconnecting up to 10 times
06:59:55 syslog: reconnect successful
07:00:34 syslog: write tcp 172.17.0.2:53288->40.114.40.67:6514: write: broken pipe
07:00:34 syslog: reconnecting up to 10 times
07:00:34 syslog: reconnect successful
Despite this is saying reconnect successful
the all process is never resuming. Restarting the all container works... Any kind of health-check would help as well.
https://github.com/gliderlabs/logspout/issues/377#issuecomment-537708428
@matgenois does INACTIVITY_TIMEOUT
resolve your issue? Can this be closed?