docker icon indicating copy to clipboard operation
docker copied to clipboard

WIP: make it so that journald logs are ascribed to containers

Open nalind opened this issue 7 years ago • 11 comments

Looking for feedback on this.

Currently, when we log container output to the journal, dockerd is doing the work of forwarding each piece of output to journald. For every piece of data that journald receives over its native socket, it looks up the unit of the process that sent that piece of data, and applies rate limiting logic to all of the data it receives from that unit. As a result, one container spewing massive numbers of log messages can effectively cause another container's output to be squelched due to rate limiting.

This PR modifies dockerd's journald logging driver to add a new process to each container's scope. That process's sole job is to receive messages over a pipe and forward them to the journal. The dockerd journald logging driver starts that process along with the container, feeds it each message, and stops it along with the container.

In my limited testing with journalctl -o export after this is applied, the journal is attributing log messages to the container's scope, so journalctl -u docker-$CONTAINERID.scope should also produce the container's logs.

nalind avatar May 10 '17 21:05 nalind

@vbatts @runcom @mrunalp PTAL

rhatdan avatar May 11 '17 12:05 rhatdan

Hmm, we don't have the "unified" controller on RHEL 7, and I suppose it's better to try to continue after failing to join the container's cgroup than to just exit and let the daemon log pipe-closed errors each time it tries to log some output.

nalind avatar May 11 '17 13:05 nalind

Factored the logging of error messages in the helper, and we're now joining the container's cgroup in all available controllers.

nalind avatar May 11 '17 14:05 nalind

RHEL system level integration testing for https://github.com/projectatomic/docker/commit/de52adfd9b5dc7d733f0f27e78fc1dcaca3a289e- FAIL

Fedora system level integration testing for https://github.com/projectatomic/docker/commit/de52adfd9b5dc7d733f0f27e78fc1dcaca3a289e- FAIL

Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/27/247-system-level-results.txt

rh-atomic-bot avatar May 11 '17 14:05 rh-atomic-bot

RHEL system level integration testing for https://github.com/projectatomic/docker/commit/927491872c8ff1b2232e9a9349d27b91bbf48b62- FAIL

Fedora system level integration testing for https://github.com/projectatomic/docker/commit/927491872c8ff1b2232e9a9349d27b91bbf48b62- FAIL

Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/28/247-system-level-results.txt

rh-atomic-bot avatar May 11 '17 15:05 rh-atomic-bot

Hmm, for some reason I'm seeing device-busy errors after exiting the container, which suggests that the log helper is keeping the container's filesystem busy, which is rather unexpected.

nalind avatar May 11 '17 17:05 nalind

RHEL system level integration testing for https://github.com/projectatomic/docker/commit/b5077e115714f5e130678da07cebe459a3bf5eee- FAIL

Fedora system level integration testing for https://github.com/projectatomic/docker/commit/b5077e115714f5e130678da07cebe459a3bf5eee- FAIL

Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/29/247-system-level-results.txt

rh-atomic-bot avatar May 11 '17 18:05 rh-atomic-bot

Scratch that, I'm seeing that with json-file as well, so it's probably not a bug that's being introduced here.

nalind avatar May 11 '17 18:05 nalind

RHEL system level integration testing for https://github.com/projectatomic/docker/commit/62f4ebc42ee4bbdfc46a05750d8f515955c4e1ca- FAIL

Fedora system level integration testing for https://github.com/projectatomic/docker/commit/62f4ebc42ee4bbdfc46a05750d8f515955c4e1ca- FAIL

Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/30/247-system-level-results.txt

rh-atomic-bot avatar May 11 '17 19:05 rh-atomic-bot

I didn't read the code, but the concept of this change sounds good to me.

cgwalters avatar May 17 '17 17:05 cgwalters