Fix memory stats for KVM

Open joseflauzino opened this issue 2 years ago • 31 comments

Description

Using KVM hypervisor, the free memory stats of the user VMs that are returned by ACS do not correspond to those observed directly in the VMs (when using the free -m command, for example).

The problem was traced to the memory stats collection performed by Libvirt: the period at which Libvirt refreshes the memory stats of each VM was never set. Thus, even though ACS periodically requested the stats from the hosts, the data returned were always the same.

This PR solves the problem described above. A new Agent configuration parameter called vm.memballoon.stats.period was created. This parameter allows operators to set the time interval at which Libvirt refreshes the memory stats. The default value of this parameter is 60 seconds, the same default value as the vm.stats.interval parameter. However, in a cloud containing multiple Management Servers, operators can set the value of vm.memballoon.stats.period lower than vm.stats.interval, so that each Management Server always gets updated memory stats for all user VMs (since each Management Server can request stats at different instants).
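To make the default concrete, here is a minimal Java sketch of how an agent could resolve vm.memballoon.stats.period from its properties, falling back to 60 seconds when the key is absent. The class and method names are hypothetical and are not taken from this PR. Falling back to the default for non-numeric or negative values is also an assumption made for illustration.

```java
import java.util.Properties;

/** Hypothetical helper for illustration only; not the code from this PR. */
final class MemballoonStatsPeriodConfig {
    static final String PERIOD_KEY = "vm.memballoon.stats.period";
    static final int DEFAULT_PERIOD_SECONDS = 60; // default stated in the PR description

    private MemballoonStatsPeriodConfig() {
    }

    /**
     * Returns the configured stats period in seconds.
     * Assumption: missing, blank, non-numeric, or negative values fall back to the default.
     */
    static int resolvePeriodSeconds(Properties agentProperties) {
        String raw = agentProperties.getProperty(PERIOD_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_PERIOD_SECONDS;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value >= 0 ? value : DEFAULT_PERIOD_SECONDS;
        } catch (NumberFormatException e) {
            return DEFAULT_PERIOD_SECONDS;
        }
    }
}
```

With this sketch, an operator running several Management Servers with vm.stats.interval at 60 could set vm.memballoon.stats.period=30, and resolvePeriodSeconds would return 30 instead of the default.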

Types of changes

  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] New feature (non-breaking change which adds functionality)
  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] Enhancement (improves an existing feature and functionality)
  • [ ] Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Bug Severity

  • [ ] BLOCKER
  • [ ] Critical
  • [ ] Major
  • [x] Minor
  • [ ] Trivial

How Has This Been Tested?

In a local lab, I tested multiple combinations for all the related Agent properties (vm.memballoon.stats.period, vm.stats.interval, and vm.memballoon.disable). In all cases tested, the memory stats of the user VMs were obtained properly. Also, I added unit tests.

joseflauzino avatar May 04 '22 19:05 joseflauzino

@blueorangutan package

joseflauzino avatar May 06 '22 11:05 joseflauzino

@joseflauzino a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar May 06 '22 11:05 blueorangutan

Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 3358

blueorangutan avatar May 06 '22 12:05 blueorangutan

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

github-actions[bot] avatar May 21 '22 14:05 github-actions[bot]

Found UI changes, kicking a new UI QA build @blueorangutan ui

acs-robot avatar May 21 '22 14:05 acs-robot

I moved the code to a more appropriate location. Also, I adjusted the code to get the VM list (Libvirt domains) by using libvirt-java instead of a virsh command. Thanks for the reviews.

joseflauzino avatar May 21 '22 14:05 joseflauzino

Found UI changes, kicking a new UI QA build @blueorangutan ui

acs-robot avatar May 21 '22 14:05 acs-robot

@acs-robot a Jenkins job has been kicked to build UI QA env. I'll keep you posted as I make progress.

blueorangutan avatar May 21 '22 14:05 blueorangutan

UI build: :heavy_check_mark: Live QA URL: http://qa.cloudstack.cloud:8080/client/pr/6358 (SL-JID-1609)

blueorangutan avatar May 21 '22 14:05 blueorangutan

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

github-actions[bot] avatar May 23 '22 13:05 github-actions[bot]

@blueorangutan package

joseflauzino avatar Jun 03 '22 13:06 joseflauzino

@joseflauzino a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Jun 03 '22 13:06 blueorangutan

Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 3525

blueorangutan avatar Jun 03 '22 14:06 blueorangutan

ping @joseflauzino can you address the review comments?

rohityadavcloud avatar Jul 27 '22 09:07 rohityadavcloud

@weizhouapache, thanks for the review. I have addressed the comments with the most straightforward solutions and will address the remaining ones ASAP.

joseflauzino avatar Jul 27 '22 19:07 joseflauzino

Thanks for the review, @GabrielBrascher. I will address your comments as well. I will be pushing a new version of the code including all the revisions in the next few days.

joseflauzino avatar Aug 01 '22 11:08 joseflauzino

Thanks for the review, @weizhouapache and @GabrielBrascher.

I have just committed the new version of the code. These are the relevant changes:

  1. Now the stats parameter of the memballoon tag will be set to 0 when vm.memballoon.disable=true.

  2. Virsh commands have been replaced by methods from the Domain class. The only exception is the command to set the stats parameter of the memballoon tag (virsh dommemstat <vm-name-or-id> --period <value-in-seconds> --live), which cannot be replaced by the memoryStats(int number) method. The reason is that this method only allows us to get the memory stats and not to set the new gathering period in the VM XML (as desired).
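The two changes listed above can be sketched in Java as follows. This is an illustration under stated assumptions, not the code committed in this PR: the class and method names are hypothetical, and the `<stats period='...'/>` string is the libvirt sub-element of the memballoon tag that the period ends up in. The command is built as an argument list (suitable for `ProcessBuilder`) so that the VM name is never interpreted by a shell.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch for illustration only; not the code from this PR. */
final class MemballoonCommandSketch {

    private MemballoonCommandSketch() {
    }

    /** Period to apply: 0 when vm.memballoon.disable=true, else the configured value. */
    static int effectivePeriod(boolean memballoonDisabled, int configuredPeriodSeconds) {
        return memballoonDisabled ? 0 : configuredPeriodSeconds;
    }

    /** Renders the stats sub-element of the memballoon tag for the given period. */
    static String statsElement(int periodSeconds) {
        return "<stats period='" + periodSeconds + "'/>";
    }

    /**
     * Builds the argument list for:
     * virsh dommemstat <vm-name-or-id> --period <value-in-seconds> --live
     */
    static List<String> setPeriodCommand(String vmNameOrId, int periodSeconds) {
        if (vmNameOrId == null || vmNameOrId.trim().isEmpty()) {
            throw new IllegalArgumentException("VM name or id must not be blank");
        }
        if (periodSeconds < 0) {
            throw new IllegalArgumentException("Period must not be negative");
        }
        return Arrays.asList("virsh", "dommemstat", vmNameOrId,
                "--period", String.valueOf(periodSeconds), "--live");
    }
}
```

A caller could pass the result of `setPeriodCommand` to `new ProcessBuilder(command)` to run it on the host. The sketch itself only builds the arguments and does not execute anything.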

With that said, I will resolve all related threads.

joseflauzino avatar Aug 11 '22 15:08 joseflauzino

Found UI changes, kicking a new UI QA build @blueorangutan ui

acs-robot avatar Aug 11 '22 15:08 acs-robot

@acs-robot a Jenkins job has been kicked to build UI QA env. I'll keep you posted as I make progress.

blueorangutan avatar Aug 11 '22 15:08 blueorangutan

UI build: :heavy_check_mark: Live QA URL: http://qa.cloudstack.cloud:8080/client/pr/6358 (SL-JID-2130)

blueorangutan avatar Aug 11 '22 15:08 blueorangutan

@blueorangutan package

joseflauzino avatar Aug 11 '22 15:08 joseflauzino

@joseflauzino a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Aug 11 '22 15:08 blueorangutan

Codecov Report

Merging #6358 (6a7f177) into main (1ee58ec) will increase coverage by 0.02%. The diff coverage is 92.85%.

@@             Coverage Diff              @@
##               main    #6358      +/-   ##
============================================
+ Coverage     10.85%   10.87%   +0.02%     
- Complexity     7106     7117      +11     
============================================
  Files          2485     2485              
  Lines        245417   245499      +82     
  Branches      38326    38334       +8     
============================================
+ Hits          26631    26699      +68     
- Misses       215516   215530      +14     
  Partials       3270     3270              
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...ypervisor/kvm/resource/LibvirtDomainXMLParser.java | 50.61% <91.66%> (+2.11%) | :arrow_up: |
| ...om/cloud/hypervisor/kvm/resource/LibvirtVMDef.java | 67.65% <92.30%> (+0.61%) | :arrow_up: |
| ...ervisor/kvm/resource/LibvirtComputingResource.java | 17.03% <93.47%> (+1.34%) | :arrow_up: |
| ...rg/apache/cloudstack/quota/QuotaStatementImpl.java | 36.28% <0.00%> (-3.99%) | :arrow_down: |

codecov[bot] avatar Aug 11 '22 15:08 codecov[bot]

Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 3964

blueorangutan avatar Aug 11 '22 15:08 blueorangutan

@blueorangutan package

shwstppr avatar Aug 16 '22 12:08 shwstppr

@shwstppr a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Aug 16 '22 12:08 blueorangutan

Packaging result: :heavy_multiplication_x: el7 :heavy_multiplication_x: el8 :heavy_multiplication_x: debian :heavy_multiplication_x: suse15. SL-JID 4003

blueorangutan avatar Aug 16 '22 13:08 blueorangutan

Packaging result: :heavy_check_mark: el7 :heavy_check_mark: el8 :heavy_check_mark: debian :heavy_check_mark: suse15. SL-JID 4014

blueorangutan avatar Aug 18 '22 08:08 blueorangutan