Migrated old Nutanix and added some real checks

Open Yogibaer75 opened this issue 2 years ago • 3 comments

Summary

Migrated the old Nutanix checks to the new check API and added some much-needed checks for Nutanix environments. This step is urgently needed, as the existing checks have almost no value in real production use. With this migration and the new checks, the agent also outputs only JSON, with no more strange formatting.

If needed, I can also provide some agent outputs for testing.


Thank you for your interest in contributing to Checkmk! Unfortunately, due to our current work load, we only consider pure bug fixes as stated in our Readme. This means any new pull request that is not a pure bug fix will be closed. Instead of creating a PR, please consider sharing new check plugins, agent plugins, special agents or notification plugins via the Checkmk Exchange.

General information

Please give a brief summary of the affected device, software or appliance. Keep in mind that we are experts in monitoring, but we cannot be experts on all supported devices. A little context will help us assess your proposed change.

Proposed changes

Sometimes it is hard for us to assess the quality of a fix. While it may work for you, it is our job to ensure that it works for everybody. These are some ways to help us:

  • What is the expected behavior?
  • What is the observed behavior?
  • If it's not obvious from the above: In what way does your patch change the current behavior?
  • Consider writing a unit test that would have failed without your fix.
  • Is this a new problem? What made you submit this PR (new firmware, new device, changed device behavior)?

Yogibaer75 avatar Oct 16 '22 20:10 Yogibaer75

A test output and some tests would be very nice to have. Otherwise it is nearly impossible for me to judge whether this works or not.

kain88-de avatar Oct 17 '22 14:10 kain88-de

What is the best way to send a complete agent output in this case?

Yogibaer75 avatar Oct 19 '22 16:10 Yogibaer75

GitHub gist if you are comfortable sharing it publicly. If not, send me an email [email protected]

kain88-de avatar Oct 19 '22 16:10 kain88-de

I needed to fix some problems found in the "public beta test" in the forum. These fixes, along with some tests, will be committed in the next few days.

Yogibaer75 avatar Oct 26 '22 19:10 Yogibaer75

Complete agent output for testing https://gist.github.com/Yogibaer75/144a28a967646b8a90aaa1af56f78c3b

First test also created - is a test like this one acceptable?

Yogibaer75 avatar Nov 12 '22 08:11 Yogibaer75

Thanks for the tests. They look good at a quick glance. I'll have more time for a closer look later this week.

kain88-de avatar Nov 14 '22 09:11 kain88-de

Only one test for the alerts is missing now.

Yogibaer75 avatar Dec 05 '22 11:12 Yogibaer75

What's the update question?

Yogibaer75 avatar Dec 09 '22 12:12 Yogibaer75

Last tests and some extra checks from a pull request on my GitHub page. I hope that is everything now. Only minor changes from here on, plus some checks if I can get more data to test.

Yogibaer75 avatar Dec 20 '22 14:12 Yogibaer75

Merging this somewhat cleanly into our internal git repo will take some work. I can do that if you do not add more changes.

kain88-de avatar Dec 23 '22 21:12 kain88-de

Merging this somewhat cleanly into our internal git repo will take some work. I can do that if you do not add more changes.

You can also give me some advice on how I can do this PR in a better way, so that it is not so "painful" to merge.

Yogibaer75 avatar Dec 24 '22 13:12 Yogibaer75

Not much you can do. In the future, try to avoid merges with master. For now I think I have found a way that is straightforward. I still have to test it, though.

kain88-de avatar Dec 25 '22 22:12 kain88-de

Now I have a straightforward way to apply this patch. Previously I applied the individual commits with this alias:

fpr = !"f() { curl \"https://patch-diff.githubusercontent.com/raw/tribe29/checkmk/pull/$1.patch\" | git am  -3 -; }; f"

Git did not like this, since the master branch has progressed a lot since the start of this PR. Thankfully, GitHub also lets you fetch a single diff for the PR. With this new alias I can apply the diff and fix the merge conflict in cmk/utils/man_pages.py.

fprdiff = !"f() { curl \"https://patch-diff.githubusercontent.com/raw/tribe29/checkmk/pull/$1.diff\" | git apply -3 -; }; f"
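
As a rough sketch of the same two approaches outside a git alias, the steps can be split into small shell functions. The function names `pr_url`, `apply_pr_commits` and `apply_pr_diff` are hypothetical and not part of the thread. The sketch assumes the `.patch` endpoint returns mailbox-formatted commits (suitable for `git am`) while the `.diff` endpoint returns one plain diff (suitable for `git apply`).

```shell
#!/bin/sh
# Hypothetical helper: build the raw URL for a checkmk PR.
# $1 = PR number, $2 = "patch" (one mail per commit) or "diff" (single diff).
pr_url() {
    printf 'https://patch-diff.githubusercontent.com/raw/tribe29/checkmk/pull/%s.%s\n' "$1" "$2"
}

# Apply every commit of the PR individually. This tends to conflict
# repeatedly when master has moved far ahead of the PR branch.
apply_pr_commits() {
    curl -fsSL "$(pr_url "$1" patch)" | git am -3
}

# Apply the whole PR as one diff. With -3, git falls back to a three-way
# merge and leaves conflict markers in files that do not apply cleanly.
apply_pr_diff() {
    curl -fsSL "$(pr_url "$1" diff)" | git apply -3
}
```

Run inside a checkout of the target branch, `apply_pr_diff 521` would then correspond to the second alias, leaving any conflicts to be resolved once rather than per commit.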

kain88-de avatar Dec 26 '22 20:12 kain88-de

Sorry for being quiet on this for a month. I'm prepping this now to be included in the master and 2.1 branches. Can you give me a list of the added checks for the werk?

kain88-de avatar Jan 30 '23 08:01 kain88-de

The following checks are now included.

Nutanix cluster

  • alerts - as before
  • IO / CPU / Mem usage of the whole cluster
  • storage container - as before
  • storage pools - as before
  • info - as before
  • protection domain status
  • remote support status
  • VM state - like the ESX VM state on vCenter server

Nutanix hosts

  • disks - status of hardware disks inside every host
  • IO / CPU / Mem usage of every host
  • storage usage of every host

Nutanix VM

  • IO / CPU / Mem usage of every VM
  • VM tool status
  • VM configuration

Yogibaer75 avatar Feb 09 '23 15:02 Yogibaer75

I merged this PR now. Thanks for all the effort.

This change will be released with 2.3.0. We have already started beta testing of 2.2.0 internally and, due to the size of this change, decided to merge it only into our master branch.

kain88-de avatar Mar 07 '23 13:03 kain88-de

OK, then I need to build an mkp for 2.2 to upload to the Exchange. How does 2.2 handle "old" classic check plugins? At the moment it does not seem possible to replace old checks with new ones without "stupid" error messages and a non-working "cmk --debug" command.

Yogibaer75 avatar Mar 07 '23 17:03 Yogibaer75

The local-vs-shipped replacement works on a file-by-file basis. Placing a file under local/share/check_mk/checks/ will cause the corresponding file in share/check_mk/checks/ to be ignored. If your local file is empty, then every plugin defined in the shipped file will be gone. You can then add a "new" plugin with the same name and everything should be fine.
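
The file-by-file rule described above can be illustrated with a small shell function. This is only a sketch of the lookup order as stated in this comment, not Checkmk's actual implementation; the function name `effective_check_file` and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the check file that would take effect.
# $1 = site root directory, $2 = check file name (e.g. prism_alerts).
effective_check_file() {
    site_root=$1
    name=$2
    if [ -e "$site_root/local/share/check_mk/checks/$name" ]; then
        # A local file, even an empty one, shadows the shipped file.
        printf '%s\n' "$site_root/local/share/check_mk/checks/$name"
    elif [ -e "$site_root/share/check_mk/checks/$name" ]; then
        printf '%s\n' "$site_root/share/check_mk/checks/$name"
    else
        # Neither location has a file with this name.
        return 1
    fi
}
```

Because the decision depends only on whether the local file exists, an empty local file is enough to hide every plugin defined in the shipped file of the same name.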

mo-ki avatar Mar 07 '23 20:03 mo-ki

@mo-ki I do not mean the "normal" replacement of checks. All the new Nutanix checks are plugin API 2.0 checks. All the "old" Nutanix checks that are also included in the upcoming 2.2 release are 1.6 API checks. If I install my mkp with the 2.0 checks, I can no longer use the "--debug" option on such a system without manually removing the shipped old Nutanix checks.

Yogibaer75 avatar Mar 07 '23 20:03 Yogibaer75

I think you need to include the empty files in the mkp. Without the empty file, you cannot replace the old plugin with a new one:

OMD[heute]:~$ cmk -R --debug
Legacy check plugin still exists for check plugin prism_alerts. Please remove legacy plugin.
Traceback (most recent call last):
...

But then, once you nuke the old one by creating an empty file:

OMD[heute]:~$ touch local/share/check_mk/checks/prism_alerts
OMD[heute]:~$ cmk -R --debug
Generating configuration for core (type cmc)...
...
Restarting monitoring core...OK
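
When several legacy files have to be shadowed before packaging, the `touch` step could be wrapped in a small function. This is a minimal sketch under assumptions: the function name `shadow_legacy_checks` is hypothetical, and the only legacy file name confirmed in this thread is `prism_alerts`; any other names would have to be taken from the shipped checks directory.

```shell
#!/bin/sh
# Hypothetical helper: create empty local files that shadow shipped legacy checks.
# $1 = site root directory, remaining arguments = legacy check file names.
shadow_legacy_checks() {
    site_root=$1
    shift
    target="$site_root/local/share/check_mk/checks"
    mkdir -p "$target" || return 1
    for name in "$@"; do
        # Leave an existing local file untouched so real local plugins survive.
        if [ ! -e "$target/$name" ]; then
            : > "$target/$name" || return 1
        fi
    done
}

# Example call as the site user, matching the session above:
#   shadow_legacy_checks "$HOME" prism_alerts
```

The empty files created this way would then still need to be added to the mkp so they are installed together with the new plugins.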

mo-ki avatar Mar 07 '23 23:03 mo-ki

OK, this will work, I think.

Yogibaer75 avatar Mar 08 '23 05:03 Yogibaer75