cloudstack icon indicating copy to clipboard operation
cloudstack copied to clipboard

Fix XenServer nested extra configuration

Open gpordeus opened this issue 1 year ago • 12 comments

Description

This PR fixes a problem when trying to boot up a XenServer VM with a template through an ISO.

Steps to reproduce:

  • Enable enable.additional.vm.configuration and add HVM-boot-params:order to allow.additional.vm.configuration.list.xen.
  • Create a XenServer VM with a template.
  • Attach an ISO to it.
  • Call updateVirtualMachine with parameter extraconfig=HVM-boot-params:order%3Ddcn
  • Start up the VM.

It will not finish, with the following log message:

INFO  [c.c.h.x.r.XenServer650Resource] (DirectAgent-71:[ctx-6c7bfe3d]) (logid:faabbafa) Appending user extra configuration settings to VM
ERROR [o.a.c.h.x.ExtraConfigurationUtility] (DirectAgent-71:[ctx-6c7bfe3d]) (logid:faabbafa) Exception caught while setting VM configuration. exception: null

After fixing the exception message, we get:

ERROR [o.a.c.h.x.ExtraConfigurationUtility] (DirectAgent-25:ctx-0d31d123) (logid:aa40d07d) Exception caught while setting VM configuration: [You tried to add a key-value pair to a map, but that key is already there.]

This PR fixes the exception message and changes the Xen API method called from addTo to set with get, letting Java substitute any duplicates in the HashMap.

Types of changes

  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] New feature (non-breaking change which adds functionality)
  • [X] Bug fix (non-breaking change which fixes an issue)
  • [ ] Enhancement (improves an existing feature and functionality)
  • [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
  • [ ] build/CI

Feature/Enhancement Scale or Bug Severity

Bug Severity

  • [ ] BLOCKER
  • [ ] Critical
  • [X] Major
  • [ ] Minor
  • [ ] Trivial

Screenshots (if appropriate):

How Has This Been Tested?

I repeated the steps to reproduce the error and verified the VM with template started up sucessfully, booting the ISO.

gpordeus avatar Feb 23 '24 13:02 gpordeus

Codecov Report

Attention: Patch coverage is 0% with 11 lines in your changes are missing coverage. Please review.

Project coverage is 30.98%. Comparing base (8f6721e) to head (df0cd8e). Report is 23 commits behind head on main.

Files Patch % Lines
...ypervisor/xenserver/ExtraConfigurationUtility.java 0.00% 10 Missing :warning:
...ervisor/xenserver/resource/CitrixResourceBase.java 0.00% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #8698      +/-   ##
============================================
+ Coverage     30.90%   30.98%   +0.08%     
- Complexity    33382    33501     +119     
============================================
  Files          5355     5355              
  Lines        375727   375784      +57     
  Branches      54915    54914       -1     
============================================
+ Hits         116109   116444     +335     
+ Misses       244219   243903     -316     
- Partials      15399    15437      +38     
Flag Coverage Δ
simulator-marvin-tests 24.85% <0.00%> (+0.09%) :arrow_up:
uitests 4.36% <ø> (+<0.01%) :arrow_up:
unit-tests 16.58% <0.00%> (+0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Feb 23 '24 14:02 codecov[bot]

@DaanHoogland Thanks for the review.

CLGTM, but don't you think the replacement code should be called inside the VirtualMachine.add<>() methods, @gpordeus ?

Do you mean the original methods (VM.addToVCPUsParams(), VM.addToHVMBootParams(), etc)? It would be better, but those are part of the xenapi :/

gpordeus avatar Feb 23 '24 16:02 gpordeus

@DaanHoogland Thanks for the review.

CLGTM, but don't you think the replacement code should be called inside the VirtualMachine.add<>() methods, @gpordeus ?

Do you mean the original methods (VM.addToVCPUsParams(), VM.addToHVMBootParams(), etc)? It would be better, but those are part of the xenapi :/

right, I thought these were in a wrapper :( ok, don't bother

DaanHoogland avatar Feb 23 '24 16:02 DaanHoogland

@blueorangutan package

DaanHoogland avatar Feb 23 '24 16:02 DaanHoogland

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Feb 23 '24 16:02 blueorangutan

Packaging result [SF]: ✖️ el7 ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 8761

blueorangutan avatar Feb 23 '24 16:02 blueorangutan

@gpordeus Unit tests are failing during build. Can you check & fix those?

vishesh92 avatar Feb 26 '24 04:02 vishesh92

@gpordeus Unit tests are failing during build. Can you check & fix those?

@DaanHoogland @vishesh92 It seems the test failed because dl.openvm.eu was down at the time. Could you run them again?

[INFO] Installing /home/runner/work/cloudstack/cloudstack/framework/quota/pom.xml to /home/runner/.m2/repository/org/apache/cloudstack/cloud-framework-quota/4.20.0.0-SNAPSHOT/cloud-framework-quota-4.20.0.0-SNAPSHOT.pom
16:29:16.334 [main] ERROR org.apache.cloudstack.direct.download.HttpDirectTemplateDownloader - Cannot reach URL: http://dl.openvm.eu/cloudstack/macchinina/x86_64/macchinina-kvm.qcow2.bz2 due to: dl.openvm.eu
java.net.UnknownHostException: dl.openvm.eu
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:229) ~[?:?]
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
	at java.net.Socket.connect(Socket.java:609) ~[?:?]
	at java.net.Socket.connect(Socket.java:558) ~[?:?]
	at java.net.Socket.<init>(Socket.java:454) ~[?:?]
	at java.net.Socket.<init>(Socket.java:306) ~[?:?]
	at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80) ~[commons-httpclient-3.1.jar:?]
	at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$1.doit(ControllerThreadSocketFactory.java:91) ~[commons-httpclient-3.1.jar:?]
	at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$SocketTask.run(ControllerThreadSocketFactory.java:158) ~[commons-httpclient-3.1.jar:?]
	at java.lang.Thread.run(Thread.java:829) ~[?:?]
Error:  Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.965 s <<< FAILURE! - in org.apache.cloudstack.direct.download.MetalinkDirectTemplateDownloaderTest
Error:  testCheckUrlMetalink(org.apache.cloudstack.direct.download.MetalinkDirectTemplateDownloaderTest)  Time elapsed: 10.964 s  <<< FAILURE!
java.lang.AssertionError
	at org.apache.cloudstack.direct.download.MetalinkDirectTemplateDownloaderTest.testCheckUrlMetalink(MetalinkDirectTemplateDownloaderTest.java:34)

gpordeus avatar Feb 26 '24 20:02 gpordeus

@gpordeus Unit tests are failing during build. Can you check & fix those?

@DaanHoogland @vishesh92 It seems the test failed because dl.openvm.eu was down at the time. Could you run them again?

[INFO] Installing /home/runner/work/cloudstack/cloudstack/framework/quota/pom.xml to /home/runner/.m2/repository/org/apache/cloudstack/cloud-framework-quota/4.20.0.0-SNAPSHOT/cloud-framework-quota-4.20.0.0-SNAPSHOT.pom
16:29:16.334 [main] ERROR org.apache.cloudstack.direct.download.HttpDirectTemplateDownloader - Cannot reach URL: http://dl.openvm.eu/cloudstack/macchinina/x86_64/macchinina-kvm.qcow2.bz2 due to: dl.openvm.eu
java.net.UnknownHostException: dl.openvm.eu
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:229) ~[?:?]
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
	at java.net.Socket.connect(Socket.java:609) ~[?:?]
	at java.net.Socket.connect(Socket.java:558) ~[?:?]
	at java.net.Socket.<init>(Socket.java:454) ~[?:?]
	at java.net.Socket.<init>(Socket.java:306) ~[?:?]
	at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80) ~[commons-httpclient-3.1.jar:?]
	at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$1.doit(ControllerThreadSocketFactory.java:91) ~[commons-httpclient-3.1.jar:?]
	at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$SocketTask.run(ControllerThreadSocketFactory.java:158) ~[commons-httpclient-3.1.jar:?]
	at java.lang.Thread.run(Thread.java:829) ~[?:?]
Error:  Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 10.965 s <<< FAILURE! - in org.apache.cloudstack.direct.download.MetalinkDirectTemplateDownloaderTest
Error:  testCheckUrlMetalink(org.apache.cloudstack.direct.download.MetalinkDirectTemplateDownloaderTest)  Time elapsed: 10.964 s  <<< FAILURE!
java.lang.AssertionError
	at org.apache.cloudstack.direct.download.MetalinkDirectTemplateDownloaderTest.testCheckUrlMetalink(MetalinkDirectTemplateDownloaderTest.java:34)

The server is not down. But the domain name can not be resolved by some DNS servers (e.g. 8.8.8.8) Please try 1.1.1.1

@DaanHoogland @NuxRo We probably can move templates to download.cloudstack.org

weizhouapache avatar Feb 26 '24 20:02 weizhouapache

@blueorangutan package

DaanHoogland avatar Feb 26 '24 20:02 DaanHoogland

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Feb 26 '24 20:02 blueorangutan

Packaging result [SF]: ✖️ el7 ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 8795

blueorangutan avatar Feb 26 '24 20:02 blueorangutan

@blueorangutan package

DaanHoogland avatar Feb 28 '24 07:02 DaanHoogland

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan avatar Feb 28 '24 07:02 blueorangutan

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 8804

blueorangutan avatar Feb 28 '24 08:02 blueorangutan

@blueorangutan test alma9 xcpng82

DaanHoogland avatar Feb 28 '24 09:02 DaanHoogland

@DaanHoogland a [SL] Trillian-Jenkins test job (alma9 mgmt + xcpng82) has been kicked to run smoke tests

blueorangutan avatar Feb 28 '24 09:02 blueorangutan

[SF] Trillian test result (tid-9338) Environment: xcpng82 (x2), Advanced Networking with Mgmt server a9 Total time taken: 68936 seconds Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8698-t9338-xcpng82.zip Smoke tests completed. 125 look OK, 4 have errors, 0 did not run Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_01_condensed_drs_algorithm Failure 170.44 test_cluster_drs.py
test_02_balanced_drs_algorithm Failure 180.24 test_cluster_drs.py
test_01_non_strict_host_anti_affinity Error 207.99 test_nonstrict_affinity_group.py
test_02_non_strict_host_affinity Error 114.43 test_nonstrict_affinity_group.py
test_deploy_vm_with_registered_userdata Error 829.04 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_allow Error 859.04 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_override_policy_append Error 826.68 test_register_userdata.py
test_deploy_vm_with_registered_userdata_with_params Error 840.29 test_register_userdata.py
test_01_deploy_vm_on_specific_host Error 0.09 test_vm_deployment_planner.py
test_02_deploy_vm_on_specific_cluster Error 0.09 test_vm_deployment_planner.py
test_03_deploy_vm_on_specific_pod Error 0.11 test_vm_deployment_planner.py
test_04_deploy_vm_on_host_override_pod_and_cluster Error 0.15 test_vm_deployment_planner.py
test_05_deploy_vm_on_cluster_override_pod Error 0.10 test_vm_deployment_planner.py

blueorangutan avatar Feb 29 '24 05:02 blueorangutan

alma9 xcpng82

@gpordeus the test run with these env specs fails a bit, is that expected? what did you use?

DaanHoogland avatar Mar 04 '24 08:03 DaanHoogland

@gpordeus the test run with these env specs fails a bit, is that expected? what did you use?

I tested it manually on a local ubuntu lab with xcp-ng-8.2 host. I didn't run the integration tests, but I believe these are related to #8586, since the same tests are acting up (cluster_drs, nonstrict_affinity_group, register_userdata and vm_deployment_planner).

gpordeus avatar Mar 12 '24 16:03 gpordeus

@gpordeus the test run with these env specs fails a bit, is that expected? what did you use?

I tested it manually on a local ubuntu lab with xcp-ng-8.2 host. I didn't run the integration tests, but I believe these are related to #8586, since the same tests are acting up (cluster_drs, nonstrict_affinity_group, register_userdata and vm_deployment_planner).

@weizhouapache , can you review? @gpordeus , can you run the integration tests to see?

DaanHoogland avatar Mar 15 '24 14:03 DaanHoogland

@blueorangutan test rocky8 xenserver-71

weizhouapache avatar Mar 15 '24 14:03 weizhouapache

@weizhouapache a [SL] Trillian-Jenkins test job (rocky8 mgmt + xenserver-71) has been kicked to run smoke tests

blueorangutan avatar Mar 15 '24 14:03 blueorangutan

[SF] Trillian test result (tid-9505) Environment: xenserver-71 (x2), Advanced Networking with Mgmt server r8 Total time taken: 44292 seconds Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8698-t9505-xenserver-71.zip Smoke tests completed. 129 look OK, 0 have errors, 0 did not run Only failed and skipped tests results shown below:

Test Result Time (s) Test File

blueorangutan avatar Mar 16 '24 06:03 blueorangutan

@vishesh92 is this alright to you?

DaanHoogland avatar Mar 21 '24 12:03 DaanHoogland