netdata
netdata copied to clipboard
Assorted cleanup in the OpenRC init script.
Summary
- Get rid of
NETDATA_START_AFTER_SERVICES. OpenRC already provides a robust configuration system for managing service dependencies, including overrides. The variable itself thus provides no real benefit but makes the script somewhat harder to read, so this PR removes it and just sets the defaultafterlist to what we had been using as the default value of the variable. - Add less ambiguous and more accurate descriptions for the
reload,rotate, andsavecommands. This way they concretely describe exactly what they are doing using standard terminology typically seen in other OpenRC scripts. - Add the ability to use
netdataclifor the above mentioned commands instead ofstart-stop-daemon. This simplifies things for users who want to run the agent under OpenRC’s native process supervision mechanism (supervise-daemon), which does not allow use ofstart-stop-daemonfor such functionality. This is controlled by a new variable calledNETDATA_USE_NETDATACLIvariable, which, if set to 1, usesnetdatacli. The variable defaults to a value of0, preserving the existing behavior as the default behavior.
Test Plan
Basic testing involves installing from this PR branch on a system that uses OpenRC (such as Alpine Linux or a default Gentoo install) and verifying that the script still works. Note that this cannot be done properly in a Docker container.
Additional Information
This fixes three of the four most glaring issues with our OpenRC init script. The final issue is our need net dependency, which is generally considered a bad thing. Properly resolving that is far more complicated though and will be handled in a separate PR.
For users: How does this change affect me?
NETDATA_START_AFTER_SERVICESis no longer supported. Instead use `rc_after` in `/etc/conf.d/netdata` like you would for any other service.NETDATA_USE_NETDATACLIcan now be set to a value of 1 to avoid using start-stop-daemon, simplifying usage of Netdata with OpenRC’s native process supervision.
Clarification needed. Should netdata-agent be built from source (this branch) on Alpine host?
Clarification needed. Should netdata-agent be built from source (this branch) on Alpine host?
That, or build a static install archive from this branch and install using that on the Alpine host.
Is this normal?
--- Installing (but not enabling) the netdata updater tool ---
./netdata-installer.sh: line 1862: cleanup_old_netdata_updater: not found
FAILED Cannot cleanup old netdata updater tool.
Is this normal?
--- Installing (but not enabling) the netdata updater tool --- ./netdata-installer.sh: line 1862: cleanup_old_netdata_updater: not found FAILED Cannot cleanup old netdata updater tool.
Nope, that’s an unrelated bug, I’ll try and have a PR up to fix it by the end of the day today.
Ok then.
What exactly is supposed by 'the script still works'?
Ok then.
What exactly is supposed by 'the script still works'?
Running still starts and stops the agent correctly.
Ideally we want to also make sure that the reload, rotate, and save commands still work as well.
On the whole, it works on Alpine 3. But:
netdata-openrc is not copied to /etc/init.d during install; I had to copy it manually
In case you stops the agent with netdatacli, its status is 'crashed':
alpine:~/netdata# /usr/sbin/netdatacli shutdown-agent
alpine:~/netdata# service netdata status
* status: crashed
But it can be cleanly stopped using service command.
Also, the only log file is the /var/log/netdata/error.log
On the whole, it works on Alpine 3. But:
netdata-openrc is not copied to /etc/init.d during install; I had to copy it manually
This is unexpected, but unrelated to this PR.
In case you stops the agent with
netdatacli, its status is 'crashed':alpine:~/netdata# /usr/sbin/netdatacli shutdown-agent alpine:~/netdata# service netdata status * status: crashed
This is expected behavior for OpenRC. It has no way to know what caused the service to exit, so any exit not mediated by OpenRC itself is listed as a crash.
But it can be cleanly stopped using
servicecommand.
Also expected behavior with OpenRC, if it can’t find the service to kill it when asked to shut down, it assumes it’s already down and updates state accordingly.
Also, the only log file is the
/var/log/netdata/error.log
I think that is expected behavior from the agent itself.
Hey.
As @iigorkarpov , I noticed:
--- Install netdata at system init ---
0
0
0
WARNING Could not determine what type of init script to install on this system.
FAILED Cannot install netdata init service.
But furthermore, I noticed I don't have a netdata-openrc file in system/ (just netdata-openrc.in).
Edit: This is with kickstart installing a pre-compiled binary in /opt/. Will try compiling.
It doesn't seem to produce a "compiled" netdata-openrc file, even from building sources...
It doesn't seem to produce a "compiled" netdata-openrc file, even from building sources...
Indeed, we appear to be having issues relating to this aside from the cleanup code here. I’ll start digging into this further and try to get that fixed, but it should probably be a separate PR.
Rebased to pick up CI changes.
Also added further improvements based on local experimentation with the script.
Rebased to pick up latest changes to system service install handling, as well as latest CI changes.
Was originally going to merge this after the v1.37.0 release, but given that that’s been delayed by a week, merging it now so it’s in the release.