checkmk
Migrated old Nutanix and added some real checks
Summary
Migrated the old Nutanix checks to the new check API and added some much-needed checks for the Nutanix environment. This step is urgently needed, as the existing checks have nearly no value for real production use. With this migration and the new checks, the agent now outputs only JSON, with no strange formatting anymore.
If needed, I can also provide some agent outputs for testing.
Thank you for your interest in contributing to Checkmk! Unfortunately, due to our current work load, we only consider pure bug fixes as stated in our Readme. This means any new pull request that is not a pure bug fix will be closed. Instead of creating a PR, please consider sharing new check plugins, agent plugins, special agents or notification plugins via the Checkmk Exchange.
General information
Please give a brief summary of the affected device, software or appliance. Keep in mind that we are experts in monitoring, but we cannot be experts on all supported devices. A little context will help us assess your proposed change.
Proposed changes
Sometimes it is hard for us to assess the quality of a fix. While it may work for you, it is our job to ensure that it works for everybody. These are some ways to help us:
- What is the expected behavior?
- What is the observed behavior?
- If it's not obvious from the above: In what way does your patch change the current behavior?
- Consider writing a unit test that would have failed without your fix.
- Is this a new problem? What made you submit this PR (new firmware, new device, changed device behavior)?
A test output and some tests would be very nice to have. Otherwise it is near impossible for me to judge if this works or not.
What is the best way to send a complete agent output in this case?
A GitHub gist, if you are comfortable sharing it publicly. If not, send me an email. On Wed 19. Oct 2022 at 18:12, Andreas Döhler wrote:
What is the best way to send a complete agent output in this case?
I needed to fix some problems found during the "public beta test" in the forum. These fixes, along with some tests, will be committed in the next few days.
Complete agent output for testing https://gist.github.com/Yogibaer75/144a28a967646b8a90aaa1af56f78c3b
First test also created - is a test like this one acceptable?
Thanks for the tests. They look good at first glance. I'll have more time to take a closer look later this week.
Only one test for the alerts is missing now.
What's the update question?
The last tests and some extra checks from a pull request on my GitHub page are now in. I hope that is everything. From here on there will only be minor changes, plus some checks if I can get more data to test with.
Merging this somewhat cleanly into our internal git repo will take some work. I can do that if you do not add more changes.
You can also give me some advice on how I can structure this PR better, so that it is not so "painful" to merge.
Not much you can do. In the future, try to avoid merges with master. For now I think I found a way that is straightforward. I still have to test it, though.
Now I have a straightforward way to apply this patch. Previously I applied the individual commits with this alias:
fpr = !"f() { curl \"https://patch-diff.githubusercontent.com/raw/tribe29/checkmk/pull/$1.patch\" | git am -3 -; }; f"
Git did not like this since the master branch has progressed a lot since the start of this PR.
GitHub thankfully also allows getting a single diff for the PR. With this new alias I can apply the diff and fix the merge conflict in cmk/utils/man_pages.py.
fprdiff = !"f() { curl \"https://patch-diff.githubusercontent.com/raw/tribe29/checkmk/pull/$1.diff\" | git apply -3 -; }; f"
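The difference between the two aliases can be sketched in a few lines of Python. This is only an illustration: the helper name `pr_apply_commands` is hypothetical, nothing is downloaded or executed, and pairing the `.diff` form with `git apply` is an assumption of this sketch (a plain diff carries no per-commit mail headers).

```python
from typing import List, Tuple

# URL prefix taken from the aliases quoted in this thread.
BASE = "https://patch-diff.githubusercontent.com/raw/tribe29/checkmk/pull"


def pr_apply_commands(pr_number: int, kind: str = "patch") -> Tuple[str, List[str]]:
    """Return (download URL, git command reading from stdin) for a pull request.

    Hypothetical helper: it only builds the strings and runs nothing.
    """
    if kind == "patch":
        # .patch: one mail-formatted patch per commit, the input `git am` expects.
        git_cmd = ["git", "am", "-3", "-"]
    elif kind == "diff":
        # .diff: one combined diff for the whole PR.
        # Assumption of this sketch: applied with `git apply` and 3-way fallback.
        git_cmd = ["git", "apply", "-3", "-"]
    else:
        raise ValueError(f"unknown kind: {kind!r}")
    return f"{BASE}/{pr_number}.{kind}", git_cmd
```

The `.patch` route replays every commit, so each one can conflict separately once master has moved on. The single diff leaves at most one round of conflicts to resolve, at the cost of losing the individual commits.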
Sorry for being quiet on this for a month. I'm prepping this now to be included in the master and 2.1 branches. Can you give me a list of the added checks for the werk?
The following checks are now included.
Nutanix cluster
- alerts - as before
- IO / CPU / Mem usage of the whole cluster
- storage container - as before
- storage pools - as before
- info - as before
- protection domain status
- remote support status
- VM state - like the ESX VM state on vCenter server
Nutanix hosts
- disks - status of hardware disks inside every host
- IO / CPU / Mem usage of every host
- storage usage of every host
Nutanix VM
- IO / CPU / Mem usage of every VM
- VM tool status
- VM configuration
I merged this PR now. Thanks for all the effort.
This change will be released with 2.3.0. We have already started beta testing of 2.2.0 internally and decided to only merge this into our master branch due to the size of this change.
OK, then I need to build an mkp for 2.2 to upload to the Exchange. How are "old" classic check plugins handled in 2.2? Right now it is not possible to replace old checks with new ones without "stupid" error messages and a non-working "cmk --debug" command.
The local-vs-shipped replacement works on a file-by-file basis. Placing a file under local/share/check_mk/checks/ will make the corresponding file in share/check_mk/checks/ be ignored. If your local file is empty, then every plugin defined in the shipped file will be gone.
You can then add a "new" plugin with the same name and everything should be fine.
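As a rough mental model of that file-by-file rule, here is a minimal Python sketch. It is not Checkmk's actual loader: the function names are hypothetical, and the only behavior it assumes is the one described above, namely that a local file with the same name wins and an empty local file defines no plugins.

```python
from pathlib import Path
from typing import Dict


def effective_check_files(shipped_dir: Path, local_dir: Path) -> Dict[str, Path]:
    """Map each check file name to the file that would be used.

    Illustrative model only: a file in local_dir shadows the shipped
    file with the same name.
    """
    chosen = {p.name: p for p in sorted(shipped_dir.iterdir()) if p.is_file()}
    if local_dir.is_dir():
        for p in sorted(local_dir.iterdir()):
            if p.is_file():
                chosen[p.name] = p  # same name: the local copy wins
    return chosen


def is_emptied(path: Path) -> bool:
    """True if the chosen file is empty, i.e. it defines no plugins at all."""
    return path.stat().st_size == 0
```

In this model, an empty `local/share/check_mk/checks/prism_alerts` replaces the shipped `prism_alerts` file and contributes nothing, which frees the name for a new plugin.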
@mo-ki I do not mean the "normal" replacement for checks. All the new Nutanix checks are plugin API 2.0 checks. All the "old" Nutanix checks that are also included in the upcoming 2.2 release are 1.6 API checks. If I install my mkp with the 2.0 checks, then I cannot use the "--debug" option anymore on such a system without manually removing the shipped old Nutanix checks.
I think you need to include the empty files in the mkp. Without the empty file, you cannot replace the old plugin by a new one:
OMD[heute]:~$ cmk -R --debug
Legacy check plugin still exists for check plugin prism_alerts. Please remove legacy plugin.
Traceback (most recent call last):
...
But then, once you nuke the old one by creating an empty file:
OMD[heute]:~$ touch local/share/check_mk/checks/prism_alerts
OMD[heute]:~$ cmk -R --debug
Generating configuration for core (type cmc)...
...
Restarting monitoring core...OK
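When preparing such a package, the `touch` step can be repeated for every legacy file that needs to be neutralized. The Python sketch below is one way to do that under stated assumptions: the helper name `shadow_legacy_checks` is hypothetical, the site root is passed in explicitly, and `prism_alerts` is the only file name confirmed in this thread, so any further names have to be looked up in the shipped checks directory.

```python
from pathlib import Path
from typing import Iterable, List


def shadow_legacy_checks(site_root: Path, names: Iterable[str]) -> List[Path]:
    """Create empty files under local/share/check_mk/checks/ for the given names.

    Hypothetical helper: mirrors the manual `touch` from the session above.
    """
    checks_dir = site_root / "local" / "share" / "check_mk" / "checks"
    checks_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in names:
        target = checks_dir / name
        # Like `touch`: creates the file if missing, never truncates an existing one.
        target.touch(exist_ok=True)
        created.append(target)
    return created
```

Run as the site user, a call such as `shadow_legacy_checks(Path.home(), ["prism_alerts"])` would correspond to the manual step, assuming the site user's home directory is the site root as the prompt in the session suggests.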
OK, this will work, I think.