Add cpu speed detection methods

Open BartJM opened this issue 1 year ago • 7 comments
Description

This PR adds two additional methods to detect the CPU speed on KVM hosts. This improves speed detection on AMD EPYC CPUs. For CPUs whose model name already contains the GHz value, nothing changes. For other CPUs, the detected speed can change to the CPU's max MHz.

  1. A match on the CPU max MHz value from lscpu
  2. An additional sysfs file scaling_max_freq
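
A minimal sketch of how the two sources above could be read, under stated assumptions. This is not the code from this PR: the class name `CpuSpeedSketch`, the method names, and the regular expression are hypothetical and chosen only for illustration. It assumes `lscpu` prints a line of the form `CPU max MHz: 2200.0000` and relies on cpufreq sysfs files such as `scaling_max_freq` reporting kHz.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the PR's KVMHostInfo implementation.
final class CpuSpeedSketch {
    // Captures the integer part of the "CPU max MHz" line; the fractional
    // part (with '.' or ',' as separator) is matched but dropped.
    private static final Pattern LSCPU_MAX_MHZ =
            Pattern.compile("(?m)^CPU max MHz:\\s*([0-9]+)(?:[.,][0-9]+)?\\s*$");

    // Returns the max speed in MHz if the lscpu output has a matching line.
    static OptionalLong maxMhzFromLscpu(String lscpuOutput) {
        Matcher m = LSCPU_MAX_MHZ.matcher(lscpuOutput);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }

    // Reads a cpufreq sysfs file (value in kHz) and converts it to MHz.
    // A missing or unparsable file yields an empty result instead of an error.
    static OptionalLong maxMhzFromSysfs(Path freqFile) {
        try {
            String raw = Files.readString(freqFile).trim();
            return OptionalLong.of(Long.parseLong(raw) / 1000);
        } catch (IOException | NumberFormatException e) {
            return OptionalLong.empty();
        }
    }
}
```

On a host, the second method would be pointed at a path such as `/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq`. Returning an empty value rather than throwing lets the caller move on to the next detection method.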

Fixes: #6914

Types of changes

  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] Enhancement (improves an existing feature and functionality)
  • [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
  • [ ] build/CI
  • [ ] test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • [ ] Major
  • [x] Minor

How Has This Been Tested?

Tested on a kvm host with an AMD EPYC 7601 cpu.

  • With a normal agent start, the CPU speed is detected as the expected 2200 MHz.
  • After removing the CPU max MHz line from the lscpu output and restarting the agent, the detected speed is still the expected 2200 MHz.

On a KVM CentOS 8 VM with no lscpu match and neither sysfs file present, the agent still falls back on host capabilities.
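
The fallback behaviour exercised in these tests can be pictured as a chain of detectors tried in turn. The sketch below is illustrative only: the class `SpeedFallbackSketch`, the method `firstDetected`, and the idea of passing detectors as a list are assumptions, and the actual order and structure in the agent may differ.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.function.Supplier;

// Hypothetical illustration of a "first detector that succeeds wins" chain.
final class SpeedFallbackSketch {
    // Tries each detector in list order and returns the first positive value.
    // If none yields a value, returns fallbackMhz, which stands in for the
    // speed reported by host capabilities.
    static long firstDetected(List<Supplier<OptionalLong>> detectors, long fallbackMhz) {
        for (Supplier<OptionalLong> detector : detectors) {
            OptionalLong mhz = detector.get();
            if (mhz.isPresent() && mhz.getAsLong() > 0) {
                return mhz.getAsLong();
            }
        }
        return fallbackMhz;
    }
}
```

With this shape, removing one source (for example the lscpu line) simply makes that detector return empty, and the next one in the list is consulted.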

BartJM avatar Oct 03 '24 11:10 BartJM

@blueorangutan package

sureshanaparti avatar Oct 03 '24 12:10 sureshanaparti

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Oct 03 '24 12:10 blueorangutan

Codecov Report

Attention: Patch coverage is 68.18182% with 7 lines in your changes missing coverage. Please review.

Project coverage is 15.78%. Comparing base (019f2c6) to head (78a981f). Report is 206 commits behind head on 4.20.

Files with missing lines                                Patch %   Lines
...org/apache/cloudstack/utils/linux/KVMHostInfo.java   68.18%    7 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##               4.20    #9762    +/-   ##
==========================================
  Coverage     15.78%   15.78%            
- Complexity    12564    12565     +1     
==========================================
  Files          5627     5627            
  Lines        492250   492261    +11     
  Branches      61405    62190   +785     
==========================================
+ Hits          77710    77718     +8     
- Misses       406066   406070     +4     
+ Partials       8474     8473     -1     
Flag        Coverage Δ
uitests     4.04% <ø> (ø)
unittests   16.60% <68.18%> (+<0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

codecov[bot] avatar Oct 03 '24 12:10 codecov[bot]

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11266

blueorangutan avatar Oct 03 '24 13:10 blueorangutan

@blueorangutan test

sureshanaparti avatar Oct 04 '24 19:10 sureshanaparti

@sureshanaparti a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

blueorangutan avatar Oct 04 '24 19:10 blueorangutan

[SF] Trillian test result (tid-11619)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 61373 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9762-t11619-kvm-ol8.zip
Smoke tests completed. 140 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test                          Result   Time (s)   Test File
test_01_secure_vm_migration   Error    134.25     test_vm_life_cycle.py
test_01_secure_vm_migration   Error    134.25     test_vm_life_cycle.py

blueorangutan avatar Oct 05 '24 12:10 blueorangutan

@DaanHoogland @BartJM it looks like some unwanted commits were added to this PR, probably due to the force push to main. @DaanHoogland, could you take a look at this?

@BartJM, you will want to run

git rebase --onto main c087de4adfe0db02802ec4fe0929a5b3d6dfba2a 0ceff7f5b4cfdd2b2f26591d933eb112d8cf2329

and then force push your branch (git push --force). Alternatively, start a new branch and git cherry-pick 0ceff7f5b4cfdd2b2f26591d933eb112d8cf2329 onto it, then either rename it to replace the branch in this PR or open a new PR.

DaanHoogland avatar Oct 23 '24 12:10 DaanHoogland

@blueorangutan test ubuntu24 kvm-ubuntu24

weizhouapache avatar Nov 25 '24 12:11 weizhouapache

[SF] Trillian test result (tid-11801)
Environment: kvm-ubuntu22 (x2), Advanced Networking with Mgmt server u22
Total time taken: 57205 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9762-t11801-kvm-ubuntu22.zip
Smoke tests completed. 139 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test                                             Result   Time (s)   Test File
ContextSuite context=TestClusterDRS>:setup       Error    0.00       test_cluster_drs.py
test_hostha_enable_ha_when_host_disabled         Error    3.01       test_hostha_kvm.py
test_hostha_enable_ha_when_host_in_maintenance   Error    302.14     test_hostha_kvm.py

blueorangutan avatar Nov 26 '24 11:11 blueorangutan

@kiranchavala can you check this and see if it fixes #9819?

DaanHoogland avatar Dec 06 '24 13:12 DaanHoogland

Sure @DaanHoogland I will take a look

kiranchavala avatar Dec 18 '24 12:12 kiranchavala

@BartJM could you rebase this onto 4.20 so that we can have it in the 4.20.1 release? Thanks.

Pearl1594 avatar Feb 12 '25 18:02 Pearl1594

@blueorangutan package

Pearl1594 avatar Feb 14 '25 14:02 Pearl1594

@blueorangutan package

Pearl1594 avatar Feb 14 '25 15:02 Pearl1594

@Pearl1594 a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Feb 14 '25 16:02 blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12469

blueorangutan avatar Feb 14 '25 17:02 blueorangutan

@blueorangutan test

Pearl1594 avatar Feb 18 '25 15:02 Pearl1594

@Pearl1594 a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

blueorangutan avatar Feb 18 '25 15:02 blueorangutan

[SF] Trillian test result (tid-12468)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 55829 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9762-t12468-kvm-ol8.zip
Smoke tests completed. 139 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test                                                Result    Time (s)   Test File
test_11_isolated_network_with_dynamic_routed_mode   Error     2.33       test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode       Error     3.47       test_ipv4_routing.py
test_12_vpc_and_tier_with_dynamic_routed_mode       Error     3.48       test_ipv4_routing.py
test_06_purge_expunged_vm_background_task           Failure   391.44     test_purge_expunged_vms.py

blueorangutan avatar Feb 19 '25 07:02 blueorangutan

@Pearl1594, cc @kiranchavala, these errors have been showing up consistently on 4.20 lately. Can we merge this?

DaanHoogland avatar Feb 19 '25 10:02 DaanHoogland

@DaanHoogland I have seen these exact failures in other PRs. I think we are safe to merge here.

JoaoJandre avatar Feb 19 '25 11:02 JoaoJandre