puppet-prometheus
add `timeout_stop` to control systemd `TimeoutStopSec`
Pull Request (PR) description
With larger WAL segments, Prometheus fails to finish writing
a new checkpoint within the systemd `TimeoutStopSec` window, so it is killed on stop:
Oct 03 15:13:37 prometheus prometheus[2452]: level=info ts=2020-10-03T15:13:37.751Z caller=checkpoint.go:96 component=tsdb msg="Creating checkpoint" from_segment=85417 to_segment=85677 mint
Oct 03 15:15:06 prometheus systemd[1]: prometheus.service: State 'stop-sigterm' timed out. Killing.
Oct 03 15:15:06 prometheus systemd[1]: prometheus.service: Killing process 2452 (prometheus) with signal SIGKILL.
Oct 03 15:15:06 prometheus systemd[1]: prometheus.service: Main process exited, code=killed, status=9/KILL
Oct 03 15:15:06 prometheus systemd[1]: prometheus.service: Failed with result 'timeout'.
Oct 03 15:15:06 prometheus systemd[1]: Stopped Prometheus Monitoring framework.
This change adds a new parameter, `timeout_stop`, that sets `TimeoutStopSec`
for prometheus.service.
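A minimal sketch of how the parameter might be used. It assumes `timeout_stop` is exposed on the main `prometheus` class and accepts a number of seconds; neither detail is confirmed in this description, and the value of 300 is only illustrative.

```puppet
# Hypothetical usage: give Prometheus up to 5 minutes to finish its
# WAL checkpoint before systemd escalates from SIGTERM to SIGKILL.
class { 'prometheus':
  # Assumed to be rendered as TimeoutStopSec= in prometheus.service.
  timeout_stop => 300,
}
```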
This Pull Request (PR) fixes the following issues
No issue previously created.
The required systemd unit file override could also have been made locally, but since I suppose others may face similar issues, I decided to implement the parameter directly.
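For comparison, a local override of that kind could be sketched as a systemd drop-in. This assumes the separate `systemd` Puppet module, which provides `systemd::dropin_file`, is installed; the drop-in file name and the 300-second value are illustrative and not taken from this PR.

```puppet
# Local workaround without the new parameter: a drop-in that overrides
# only TimeoutStopSec and leaves the managed unit file untouched.
systemd::dropin_file { 'timeout-stop.conf':
  unit    => 'prometheus.service',
  content => "[Service]\nTimeoutStopSec=300\n",
}
```

A drop-in keeps the override separate from the unit file that the prometheus module manages, but every affected site has to carry it independently.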
@antondollmaier can you please rebase against our latest master branch?
@antondollmaier ping :)
Dear @antondollmaier, thanks for the PR!
This is Vox Pupuli Tasks, your friendly Vox Pupuli GitHub Bot. I noticed that your pull request contains a merge conflict. Can you please rebase?
You can find my source code at voxpupuli/vox-pupuli-tasks
I have opened a new issue to change the systemd configuration, which will allow customization like this without adding new parameters (https://github.com/voxpupuli/puppet-prometheus/issues/734). Since this PR has not seen any action for a while, I will close it.