Prometheus exporter enhancement
Description
This pull request adds new functionality to the CloudStack Prometheus exporter. See the testing section below for the before/after differences.
Types of changes
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
How Has This Been Tested?
This pull request contains seven commits. All of them except dfb35e5224 add new functionality to the Prometheus exporter. The following sections describe what each commit does. I tested them in my test environment with three management servers, one DB node (MySQL), and two KVM hypervisors.
1. Export count of total/up/down hosts by tags 0dbe9e78a3660bef73451c6d56f4826509833f2b
- Enable Prometheus.
- Add tag to the host.
- Run
curl http://127.0.0.1:9595/metrics | grep cloudstack_hosts_total
Output Before Changes:
cloudstack_hosts_total{zone="mgt122-60",filter="online"} 2
cloudstack_hosts_total{zone="mgt122-60",filter="offline"} 0
cloudstack_hosts_total{zone="mgt122-60",filter="total"} 2
Output After Changes:
cloudstack_hosts_total{zone="mgt122-60",filter="online"} 2
cloudstack_hosts_total{zone="mgt122-60",filter="offline"} 0
cloudstack_hosts_total{zone="mgt122-60",filter="total"} 2
cloudstack_hosts_total{zone="mgt122-60",filter="total",tags="tage1"} 1
cloudstack_hosts_total{zone="mgt122-60",filter="online",tags="tage1"} 1
cloudstack_hosts_total{zone="mgt122-60",filter="offline",tags="tage1"} 0
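The check above can also be scripted instead of read by eye. The following is a minimal Python sketch, not part of this PR: `parse_samples` and `hosts_by_tag` are hypothetical helper names, the parser handles only the simple `name{labels} value` lines shown above (it assumes label values contain no `}` or escaped quotes), and the sample text is copied from the "Output After Changes" listing.

```python
import re

# One sample line of the form: name{label="value",...} number
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, float_value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser does not understand
        labels = dict(LABEL_RE.findall(match.group("labels")))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def hosts_by_tag(samples, tag):
    """Map filter (online/offline/total) to the host count for one host tag."""
    return {
        labels["filter"]: value
        for name, labels, value in samples
        if name == "cloudstack_hosts_total" and labels.get("tags") == tag
    }


# Copied from the "Output After Changes" listing above.
OUTPUT_AFTER = '''
cloudstack_hosts_total{zone="mgt122-60",filter="online"} 2
cloudstack_hosts_total{zone="mgt122-60",filter="offline"} 0
cloudstack_hosts_total{zone="mgt122-60",filter="total"} 2
cloudstack_hosts_total{zone="mgt122-60",filter="total",tags="tage1"} 1
cloudstack_hosts_total{zone="mgt122-60",filter="online",tags="tage1"} 1
cloudstack_hosts_total{zone="mgt122-60",filter="offline",tags="tage1"} 0
'''

tagged = hosts_by_tag(parse_samples(OUTPUT_AFTER), "tage1")
print(tagged)
```

To run it against a live management server, the text returned by the curl command above could be passed to `parse_samples` in place of `OUTPUT_AFTER`.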
2. Export count of vms by state and host tag e6a81d16d9f11db6bb4fd2b0ab38194961ce516b
- Enable Prometheus.
- Add tag to the host.
- Run
curl http://127.0.0.1:9595/metrics | grep cloudstack_vms_total_by_tag
After the changes, the following lines are added to the Prometheus output:
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="starting",tags="tage1"} 0
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="running",tags="tage1"} 0
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="stopping",tags="tage1"} 0
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="stopped",tags="tage1"} 0
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="destroyed",tags="tage1"} 0
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="expunging",tags="tage1"} 0
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="migrating",tags="tage1"} 0
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="error",tags="tage1"} 0
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="unknown",tags="tage1"} 0
cloudstack_vms_total_by_tag{zone="mgt122-60",filter="shutdown",tags="tage1"} 0
3. Add host tags to host cpu/cores/memory usage in Prometheus exporter eefd9f197352653f74aff73ccfffc4dd86d56b0d
- Enable Prometheus.
- Add tag to the host.
- Run the following command and compare the output with the expected results.
curl http://127.0.0.1:9595/metrics | grep cloudstack_host_vms_cores_total
- Repeat step three for cloudstack_host_cpu_usage_mhz_total and cloudstack_host_memory_usage_mibs_total.
Output Before Changes:
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node75",ip="10.135.122.75",filter="used",dedicated="0"} 2
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node75",ip="10.135.122.75",filter="total",dedicated="0"} 4
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node74",ip="10.135.122.74",filter="used",dedicated="0"} 2
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node74",ip="10.135.122.74",filter="total",dedicated="0"} 4
cloudstack_host_vms_cores_total{zone="mgt122-60",filter="allocated"} 4
cloudstack_host_vms_cores_total_by_tag{zone="mgt122-60",filter="allocated",tags="tage1"} 0
Output After Changes:
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node75",ip="10.135.122.75",filter="used",dedicated="0",tags="tage1"} 2
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node75",ip="10.135.122.75",filter="total",dedicated="0",tags="tage1"} 4
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node74",ip="10.135.122.74",filter="used",dedicated="0",tags=""} 2
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node74",ip="10.135.122.74",filter="total",dedicated="0",tags=""} 4
cloudstack_host_vms_cores_total{zone="mgt122-60",filter="allocated"} 4
4. Cloudstack Prometheus exporter: Add allocated capacity group by host tag. a489e3c6b269279df5fbff32a708d9ed0296a40e
- Enable Prometheus.
- Add tag to the host.
- Run
curl http://127.0.0.1:9595/metrics | grep cloudstack_host_vms_cores_total
Output Before Changes:
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node75",ip="10.135.122.75",filter="used",dedicated="0"} 2
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node75",ip="10.135.122.75",filter="total",dedicated="0"} 4
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node74",ip="10.135.122.74",filter="used",dedicated="0"} 2
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node74",ip="10.135.122.74",filter="total",dedicated="0"} 4
cloudstack_host_vms_cores_total{zone="mgt122-60",filter="allocated"} 4
cloudstack_host_vms_cores_total_by_tag{zone="mgt122-60",filter="allocated",tags="tage1"} 0
Output After Changes:
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node75",ip="10.135.122.75",filter="used",dedicated="0",tags="tage1"} 2
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node75",ip="10.135.122.75",filter="total",dedicated="0",tags="tage1"} 4
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node74",ip="10.135.122.74",filter="used",dedicated="0",tags=""} 2
cloudstack_host_vms_cores_total{zone="mgt122-60",hostname="node74",ip="10.135.122.74",filter="total",dedicated="0",tags=""} 4
cloudstack_host_vms_cores_total{zone="mgt122-60",filter="allocated"} 4
cloudstack_host_vms_cores_total_by_tag{zone="mgt122-60",filter="allocated",tags="tage1"} 0
5. Show count of Active domains on grafana de08479da13b7b3f3eb467fc3798c6734f0e6fb7
============== Scenario One ==============
- Enable Prometheus.
- Run
curl http://127.0.0.1:9595/metrics | grep cloudstack_active_domains_total
The output is:
cloudstack_active_domains_total{zone="mgt122-60"} 1
- Create a new domain
- Repeat step two. The output will not change.
- Add a new account to the domain created in step three.
- Repeat step two. The output will change to:
cloudstack_active_domains_total{zone="mgt122-60"} 2
============== Scenario Two ==============
- Use the previous environment.
- Disable all accounts in the domain created in step three of Scenario One.
- Repeat step two of Scenario one. The output will change to:
cloudstack_active_domains_total{zone="mgt122-60"} 1
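The two scenarios above suggest a counting rule: a domain appears to be counted as active only while it has at least one enabled account. The Python sketch below models that rule and replays the steps. It is inferred from the observed outputs, not taken from the exporter code, and the `Account` class, `count_active_domains`, and the names `ROOT`, `admin`, `test-domain`, and `user1` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    domain: str
    enabled: bool = True


def count_active_domains(accounts):
    """Count distinct domains that own at least one enabled account (assumed rule)."""
    return len({account.domain for account in accounts if account.enabled})


# Scenario One, step two: only the default domain has an account.
accounts = [Account("admin", "ROOT")]
initial = count_active_domains(accounts)

# Creating an empty domain adds no accounts, so the count stays the same.
after_new_domain = count_active_domains(accounts)

# Adding an account to the new domain makes that domain active.
accounts.append(Account("user1", "test-domain"))
after_new_account = count_active_domains(accounts)

# Scenario Two: disabling every account in the domain makes it inactive again.
accounts[-1].enabled = False
after_disable = count_active_domains(accounts)

print(initial, after_new_domain, after_new_account, after_disable)
```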
6. Show count of Active accounts and vms by size on grafana d7aa19f0f850dfd5eea5c4f51a6529d39c2daf88
============== Scenario One ==============
- Enable Prometheus.
- Run
curl http://127.0.0.1:9595/metrics | grep cloudstack_active_accounts_total
The output is:
cloudstack_active_accounts_total{zone="mgt122-60"} 1
- Create a new account
- Repeat step two. The output will change to:
cloudstack_active_accounts_total{zone="mgt122-60"} 2
============== Scenario Two ==============
- Enable Prometheus.
- Run
curl http://127.0.0.1:9595/metrics | grep cloudstack_vms_total_by_size
The output is:
cloudstack_vms_total_by_size{zone="mgt122-60",cpu="1",memory="512"} 2
- Add a new instance with a different offering.
- Repeat step two. The output will change to:
cloudstack_vms_total_by_size{zone="mgt122-60",cpu="1",memory="512"} 2
cloudstack_vms_total_by_size{zone="mgt122-60",cpu="1",memory="1024"} 1
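The grouping in Scenario Two can be illustrated with a short Python sketch. It is only a model of the output shown above, not the exporter's Java implementation: `vms_by_size_lines` is a hypothetical helper, and treating the `memory` label as MiB is an assumption.

```python
from collections import Counter


def vms_by_size_lines(zone, vms):
    """Render one sample line per distinct (cpu, memory) pair.

    `vms` is an iterable of (cpu_cores, memory) tuples, one per instance.
    """
    counts = Counter(vms)
    return [
        f'cloudstack_vms_total_by_size{{zone="{zone}",cpu="{cpu}",memory="{memory}"}} {count}'
        for (cpu, memory), count in sorted(counts.items())
    ]


# Two instances with the same offering, as in step two.
before = vms_by_size_lines("mgt122-60", [(1, 512), (1, 512)])

# A third instance with a different offering adds a second sample line.
after = vms_by_size_lines("mgt122-60", [(1, 512), (1, 512), (1, 1024)])

print("\n".join(after))
```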
Hi @soreana is this PR ready for review? @blueorangutan package
@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_multiplication_x: centos7 :heavy_check_mark: centos8 :heavy_multiplication_x: debian. SL-JID 444
Hey @nvazquez Yes, it is ready for review.
@blueorangutan package
@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_check_mark: centos7 :heavy_check_mark: centos8 :heavy_check_mark: debian. SL-JID 452
@blueorangutan test
@nvazquez a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
Trillian test result (tid-1191)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 38732 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4438-t1191-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Smoke tests completed. 88 look OK, 0 have error(s)
Only failed tests results shown below:
Test | Result | Time (s) | Test File |
---|---|---|---|
@blueorangutan package
@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian. SL-JID 814
@blueorangutan test
@nvazquez a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
@blueorangutan package
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_multiplication_x: el7 :heavy_check_mark: el8 :heavy_multiplication_x: debian :heavy_check_mark: suse15. SL-JID 1165
@blueorangutan package
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.
Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 1237
@blueorangutan test
@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
@blueorangutan test
@nvazquez a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
Trillian test result (tid-2077)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 41441 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4438-t2077-kvm-centos7.zip
Smoke tests completed. 85 look OK, 4 have errors
Only failed tests results shown below:
Test | Result | Time (s) | Test File |
---|---|---|---|
test_01_add_primary_storage_disabled_host | Error | 1.22 | test_primary_storage.py |
test_01_primary_storage_nfs | Error | 0.13 | test_primary_storage.py |
ContextSuite context=TestStorageTags>:setup | Error | 0.23 | test_primary_storage.py |
test_02_list_snapshots_with_removed_data_store | Error | 1.31 | test_snapshots.py |
test_01_secure_vm_migration | Error | 164.57 | test_vm_life_cycle.py |
test_02_unsecure_vm_migration | Error | 276.33 | test_vm_life_cycle.py |
test_03_secured_to_nonsecured_vm_migration | Error | 148.04 | test_vm_life_cycle.py |
test_08_migrate_vm | Error | 44.80 | test_vm_life_cycle.py |
test_hostha_enable_ha_when_host_in_maintenance | Error | 307.19 | test_hostha_kvm.py |
@blueorangutan test
@nvazquez a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
Trillian test result (tid-2135)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 50191 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4438-t2135-kvm-centos7.zip
Smoke tests completed. 90 look OK, 3 have errors
Only failed tests results shown below:
Test | Result | Time (s) | Test File |
---|---|---|---|
test_deploy_vm_start_failure | Error | 61.27 | test_deploy_vm.py |
test_deploy_vm_volume_creation_failure | Error | 61.36 | test_deploy_vm.py |
test_vm_ha | Error | 59.33 | test_vm_ha.py |
test_vm_sync | Error | 129.03 | test_vm_sync.py |
@blueorangutan package