docker
docker copied to clipboard
WIP: make it so that journald logs are ascribed to containers
Looking for feedback on this.
Currently, when we log container output to the journal, dockerd is doing the work of forwarding each piece of output to journald. For every piece of data that journald receives over its native socket, it looks up the unit of the process that sent that piece of data, and applies rate limiting logic to all of the data it receives from that unit. As a result, one container spewing massive numbers of log messages can effectively cause another container's output to be squelched due to rate limiting.
This PR modifies dockerd's journald
logging driver to add a new process to each container's scope. That process's sole job is to receive messages over a pipe and forward them to the journal. The dockerd journald
logging driver starts that process along with the container, feeds it each message, and stops it along with the container.
In my limited testing with journalctl -o export
after this is applied, the journal is attributing log messages to the container's scope, so journalctl -u docker-$CONTAINERID.scope
should also produce the container's logs.
@vbatts @runcom @mrunalp PTAL
Hmm, we don't have the "unified" controller on RHEL 7, and I suppose it's better to try to continue after failing to join the container's cgroup than to just exit and let the daemon log pipe-closed errors each time it tries to log some output.
Factored the logging of error messages in the helper, and we're now joining the container's cgroup in all available controllers.
RHEL system level integration testing for https://github.com/projectatomic/docker/commit/de52adfd9b5dc7d733f0f27e78fc1dcaca3a289e- FAIL
Fedora system level integration testing for https://github.com/projectatomic/docker/commit/de52adfd9b5dc7d733f0f27e78fc1dcaca3a289e- FAIL
Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/27/247-system-level-results.txt
RHEL system level integration testing for https://github.com/projectatomic/docker/commit/927491872c8ff1b2232e9a9349d27b91bbf48b62- FAIL
Fedora system level integration testing for https://github.com/projectatomic/docker/commit/927491872c8ff1b2232e9a9349d27b91bbf48b62- FAIL
Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/28/247-system-level-results.txt
Hmm, for some reason I'm seeing device-busy errors after exiting the container, which suggests that the log helper is keeping the container's filesystem busy, which is rather unexpected.
RHEL system level integration testing for https://github.com/projectatomic/docker/commit/b5077e115714f5e130678da07cebe459a3bf5eee- FAIL
Fedora system level integration testing for https://github.com/projectatomic/docker/commit/b5077e115714f5e130678da07cebe459a3bf5eee- FAIL
Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/29/247-system-level-results.txt
Scratch that, I'm seeing that with json-file as well, so it's probably not a bug that's being introduced here.
RHEL system level integration testing for https://github.com/projectatomic/docker/commit/62f4ebc42ee4bbdfc46a05750d8f515955c4e1ca- FAIL
Fedora system level integration testing for https://github.com/projectatomic/docker/commit/62f4ebc42ee4bbdfc46a05750d8f515955c4e1ca- FAIL
Log - https://aos-ci.s3.amazonaws.com/projectatomic/docker/projectatomic-docker-integration-tests-prs/30/247-system-level-results.txt
I didn't read the code, but the concept of this change sounds good to me.