podman
HealthCheck log output options
This PR adds three new flags that control the output of the HealthCheck log.
Currently, when a container is configured with HealthCheck, the output from the HealthCheck command is only logged to the container status file, which is accessible via podman inspect. It is also limited to the last five executions and the first 500 characters per execution.
This makes debugging past problems very difficult, since the only information available about the failure of the HealthCheck command is the generic healthcheck service failed record.
- The `--health-log-destination` flag sets the destination of the HealthCheck log.
  - `none`: (default behavior) `HealthCheckResults` are stored in overlay containers. (For example: `./run/containers/storage/overlay-containers/<container-ID>/healthcheck.log`)
  - `directory`: creates a log file named `<container-ID>-healthcheck.log` with JSON `HealthCheckResults` in the specified directory.
  - `events_logger`: The log will be written with the logging mechanism set by events_logger.
- The `--health-max-log-count` flag sets the maximum number of attempts in the HealthCheck log file.
  - A value of `0` indicates an infinite number of attempts in the log file.
  - The default value is `5` attempts in the log file.
- The `--health-max-log-size` flag sets the maximum length of the log stored.
  - A value of `0` indicates an infinite log length.
  - The default value is `500` log characters.
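
To make the count and size semantics concrete, here is a minimal Python sketch of a bounded log. It is a toy model only, not the code in this PR: the `HealthLog` class and its field names are made up for illustration, and it assumes that truncation keeps the first characters of the output and that rotation keeps the most recent attempts, as described for the current behavior above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HealthLog:
    """Toy model of a bounded health check log (hypothetical, not podman's code)."""
    max_log_count: int = 5    # mirrors the --health-max-log-count default
    max_log_size: int = 500   # mirrors the --health-max-log-size default
    entries: List[str] = field(default_factory=list)

    def record(self, output: str) -> None:
        # A size limit of 0 means "do not truncate the output".
        if self.max_log_size > 0:
            output = output[: self.max_log_size]
        self.entries.append(output)
        # A count limit of 0 means "keep every attempt".
        if self.max_log_count > 0:
            # Drop the oldest attempts so only the newest ones remain.
            self.entries = self.entries[-self.max_log_count:]


# Defaults: seven attempts go in, the newest five survive, each cut to 500 chars.
log = HealthLog()
for i in range(7):
    log.record(f"attempt {i}: " + "x" * 600)
print(len(log.entries), len(log.entries[0]))

# Both limits set to 0: nothing is dropped and nothing is truncated.
unlimited = HealthLog(max_log_count=0, max_log_size=0)
for i in range(7):
    unlimited.record(f"attempt {i}: " + "x" * 600)
print(len(unlimited.entries), len(unlimited.entries[0]))
```

With the defaults the oldest surviving entry is attempt 2. With both limits at `0` all seven attempts are kept at full length.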
Does this PR introduce a user-facing change?
Added --health-log-destination, --health-max-log-count and --health-max-log-size flags that affect HealthCheck log output.
Fixes: RHEL-24623
@mheon PTAL
@Luap99 PTAL, particularly at the events bits. I don't really mind but we're getting a lot of feedback about how we're handling events.
@Luap99 PTAL
@mheon @Luap99 PTAL, I have resolved the feedback and checked that the defaults are propagated correctly.
@edsantiago @Luap99 PTAL, I've modified the code according to your feedback.
Looking, but, test flakes rootless on my laptop:
✗ |220| podman healthcheck --health-max-log-size infinite value (0) [3485]
...
...very very very very long string
#/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
#| FAIL: Number of matching health log messages
#| expected: -eq 2
#| actual: 1
#\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This happens because podman run creates and starts the systemd timer right away, and the timer triggers the first HealthCheck run when the container is created. The test then triggers podman healthcheck run manually, which is why the log can contain a second run. Whether the timer-driven run has completed by then depends on systemd, so the count is not deterministic. I can avoid this by using the -ge comparison. WDYT? @edsantiago
@edsantiago PTAL, I've modified the tests according to your suggestions.
Ephemeral COPR build failed. @containers/packit-build please check.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: edsantiago, Honny1, Luap99
- ~~OWNERS~~ [Luap99,edsantiago]