[4.20] VR: fix issue if userdata is binary data

Open weizhouapache opened this issue 11 months ago • 17 comments

Description

This PR fixes an issue where a VM cannot start when its userdata is binary data. This is a regression introduced by #8497.

Steps to reproduce the issue and verify the fix

  1. Create userdata with the content below. cloud-init supports gzip-compressed userdata, for example:
$ echo "Apache CloudStack" |gzip |base64 -w0
H4sIAAAAAAAAA3MsSEzOSFVwzskvTQkuSUzO5gIAxXcOkxIAAAA=
  2. Deploy a VM with the userdata. The deployment fails:
2024-03-04T11:13:27,400 DEBUG [c.c.a.t.Request] (AgentManager-Handler-7:[]) (logid:) Seq 1-8288593639198687341: Processing:  { Ans: , MgmtId: 167781234, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.routing.GroupAnswer":{"results":["null - success: Creating file in VR, with ip: 169.254.52.151, file: vm_password.json.5f0a36da-6fa7-48e4-8aae-dcc8923fc42d","null - success: "],"result":"true","wait":"0","bypassHostMaintenance":"false"}},{"com.cloud.agent.api.routing.GroupAnswer":{"results":["null - success: Creating file in VR, with ip: 169.254.52.151, file: vm_metadata.json.53c05bda-dc05-49df-b548-f517ae0fa546","null - failed: Traceback (most recent call last):  File "/opt/cloud/bin/update_config.py", line 147, in <module>    process_file()  File "/opt/cloud/bin/update_config.py", line 57, in process_file    finish_config()  File "/opt/cloud/bin/update_config.py", line 42, in finish_config    returncode = configure.main(sys.argv)                 ^^^^^^^^^^^^^^^^^^^^^^^^  File "/opt/cloud/bin/configure.py", line 1413, in main    execDatabag(json_type, databag_map)  File "/opt/cloud/bin/configure.py", line 1400, in execDatabag    executor.process()  File "/opt/cloud/bin/configure.py", line 625, in process    self.__createfile(ip, folder, file, data)  File "/opt/cloud/bin/configure.py", line 652, in __createfile    fh.write(data.decode())             ^^^^^^^^^^^^^UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte"],"result":"false","wait":"0","bypassHostMaintenance":"false"}}] }
  3. Apply this PR. The VM starts successfully.
  4. Check the userdata inside the VM.
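
The failure in the traceback can be reproduced outside the VR. The following is a minimal sketch, not the change made in this PR: it assumes, judging by the traceback, that the router ends up holding the raw gzip bytes and calls `decode()` on them before writing. The helper `write_payload` and the file name `user-data` are hypothetical and only illustrate one binary-safe way to store the payload.

```python
import base64
import gzip
import os
import tempfile

# Build a payload like the one in the reproduction steps: gzip-compressed text,
# base64-encoded for transport. Python's gzip header fields differ from the
# gzip CLI's, so this base64 string will not match the one shown in the steps.
raw = gzip.compress(b"Apache CloudStack\n")
userdata_b64 = base64.b64encode(raw).decode("ascii")

# Assumption: after base64 decoding, the router holds raw gzip bytes.
data = base64.b64decode(userdata_b64)
assert data[:2] == b"\x1f\x8b"  # gzip magic number

# Reproduce the error: bytes.decode() defaults to UTF-8, and 0x8b (the second
# byte of every gzip stream) is not a valid UTF-8 start byte.
try:
    data.decode()
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc


def write_payload(path: str, payload: bytes) -> None:
    """Hypothetical helper: write the bytes unchanged, without decoding them."""
    with open(path, "wb") as fh:
        fh.write(payload)


with tempfile.TemporaryDirectory() as tmp:
    dest = os.path.join(tmp, "user-data")  # hypothetical file name
    write_payload(dest, data)
    with open(dest, "rb") as fh:
        stored = fh.read()

print(decode_error)
print(gzip.decompress(stored))
```

Opening the file in binary mode avoids any assumption about the payload's encoding, so plain-text and gzip userdata would both be stored byte for byte.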

Types of changes

  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] New feature (non-breaking change which adds functionality)
  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] Enhancement (improves an existing feature and functionality)
  • [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
  • [ ] build/CI

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • [ ] Major
  • [ ] Minor

Bug Severity

  • [ ] BLOCKER
  • [x] Critical
  • [ ] Major
  • [ ] Minor
  • [ ] Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

weizhouapache, Mar 04 '24 11:03

@blueorangutan package

weizhouapache, Mar 04 '24 11:03

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan, Mar 04 '24 11:03

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 15.53%. Comparing base (cea4801) to head (ea6f47b). Report is 27 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##               main    #8739    +/-   ##
==========================================
  Coverage     15.53%   15.53%            
  Complexity    11967    11967            
==========================================
  Files          5492     5492            
  Lines        480934   480934            
  Branches      60876    60056   -820     
==========================================
  Hits          74711    74711            
  Misses       397962   397962            
  Partials       8261     8261            
Flag Coverage Δ
uitests 4.17% <ø> (ø)
unittests 16.30% <ø> (ø)

codecov[bot], Mar 04 '24 11:03

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 8847

blueorangutan, Mar 04 '24 12:03

@blueorangutan test matrix

weizhouapache, Mar 04 '24 12:03

@weizhouapache a [SL] Trillian-Jenkins matrix job (centos7 mgmt + xenserver71, rocky8 mgmt + vmware67u3, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

blueorangutan, Mar 04 '24 12:03

[SF] Trillian test result (tid-9387)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 43267 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8739-t9387-kvm-centos7.zip
Smoke tests completed. 129 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

blueorangutan, Mar 05 '24 01:03

[SF] Trillian test result (tid-9385)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 44270 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8739-t9385-xenserver-71.zip
Smoke tests completed. 129 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

blueorangutan, Mar 05 '24 01:03

[SF] Trillian test result (tid-9386)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 49495 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8739-t9386-vmware-67u3.zip
Smoke tests completed. 128 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_02_balanced_drs_algorithm Error 425.77 test_cluster_drs.py

blueorangutan, Mar 05 '24 02:03

@blueorangutan package

weizhouapache, Apr 24 '24 07:04

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan, Apr 24 '24 07:04

Packaging result [SF]: ✔️ el7 ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 9386

blueorangutan, Apr 24 '24 08:04

@blueorangutan package

weizhouapache, Apr 24 '24 08:04

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan, Apr 24 '24 08:04

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 9392

blueorangutan, Apr 24 '24 09:04

@blueorangutan test alma9 kvm-alma9

weizhouapache, Apr 30 '24 10:04

This fixes the issue where CAPC cannot create a Kubernetes cluster on 4.20/main.

cc @rohityadavcloud

weizhouapache, Apr 30 '24 10:04

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

github-actions[bot], Jul 10 '24 09:07

@blueorangutan package

DaanHoogland, Jul 11 '24 10:07

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan, Jul 11 '24 10:07

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10325

blueorangutan, Jul 11 '24 12:07

@blueorangutan test

DaanHoogland, Jul 12 '24 07:07

@DaanHoogland a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

blueorangutan, Jul 12 '24 07:07

@DaanHoogland will you test this pr ? the PR description might help

weizhouapache, Jul 12 '24 07:07

> @DaanHoogland will you test this pr ? the PR description might help

will find time... (as if that would be a promise ;)

DaanHoogland, Jul 12 '24 08:07

> > @DaanHoogland will you test this pr ? the PR description might help
>
> will find time... (as if that would be a promise ;)

ah ha, I consider it as a promise. thanks buddy @DaanHoogland

weizhouapache, Jul 12 '24 08:07

[SF] Trillian test result (tid-10807)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 65194 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8739-t10807-kvm-centos7.zip
Smoke tests completed. 113 look OK, 24 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_role_account_acls_multiple_mgmt_servers Error 2.17 test_dynamicroles.py
test_query_async_job_result Error 105.94 test_async_job.py
test_revoke_certificate Error 0.01 test_certauthority_root.py
test_configure_ha_provider_invalid Error 0.01 test_hostha_simulator.py
test_configure_ha_provider_valid Error 0.01 test_hostha_simulator.py
test_ha_configure_enabledisable_across_clusterzones Error 0.01 test_hostha_simulator.py
test_ha_disable_feature_invalid Error 0.01 test_hostha_simulator.py
test_ha_enable_feature_invalid Error 0.01 test_hostha_simulator.py
test_ha_list_providers Error 0.01 test_hostha_simulator.py
test_ha_multiple_mgmt_server_ownership Error 0.01 test_hostha_simulator.py
test_ha_verify_fsm_available Error 0.01 test_hostha_simulator.py
test_ha_verify_fsm_degraded Error 0.01 test_hostha_simulator.py
test_ha_verify_fsm_fenced Error 0.01 test_hostha_simulator.py
test_ha_verify_fsm_recovering Error 0.01 test_hostha_simulator.py
test_hostha_configure_default_driver Error 0.01 test_hostha_simulator.py
test_hostha_configure_invalid_provider Error 0.01 test_hostha_simulator.py
test_hostha_disable_feature_valid Error 0.01 test_hostha_simulator.py
test_hostha_enable_feature_valid Error 0.01 test_hostha_simulator.py
test_hostha_enable_feature_without_setting_provider Error 0.01 test_hostha_simulator.py
test_list_ha_for_host Error 0.01 test_hostha_simulator.py
test_list_ha_for_host_invalid Error 0.01 test_hostha_simulator.py
test_list_ha_for_host_valid Error 0.01 test_hostha_simulator.py
test_01_host_ping_on_alert Error 0.10 test_host_ping.py
test_01_host_ping_on_alert Error 0.10 test_host_ping.py
test_01_browser_migrate_template Error 15.34 test_image_store_object_migration.py
test_01_invalid_upgrade_kubernetes_cluster Failure 237.87 test_kubernetes_clusters.py
test_02_upgrade_kubernetes_cluster Failure 247.94 test_kubernetes_clusters.py
test_03_deploy_and_scale_kubernetes_cluster Failure 223.49 test_kubernetes_clusters.py
test_04_autoscale_kubernetes_cluster Failure 239.74 test_kubernetes_clusters.py
test_05_basic_lifecycle_kubernetes_cluster Failure 222.24 test_kubernetes_clusters.py
test_06_delete_kubernetes_cluster Failure 230.60 test_kubernetes_clusters.py
test_08_upgrade_kubernetes_ha_cluster Failure 317.22 test_kubernetes_clusters.py
test_10_vpc_tier_kubernetes_cluster Failure 258.87 test_kubernetes_clusters.py
test_11_test_unmanaged_cluster_lifecycle Error 93.42 test_kubernetes_clusters.py
test_01_add_delete_kubernetes_supported_version Error 0.11 test_kubernetes_supported_versions.py
login_test_saml_user Error 3.06 test_login.py
test_01_deployVMInSharedNetwork Error 77.40 test_network.py
test_03_destroySharedNetwork Failure 1.07 test_network.py
<nose.suite.ContextSuite context=TestSharedNetwork>:teardown Error 2.16 test_network.py
test_oobm_issue_power_cycle Error 3.30 test_outofbandmanagement_nestedplugin.py
test_oobm_issue_power_off Error 3.33 test_outofbandmanagement_nestedplugin.py
test_oobm_issue_power_on Error 3.31 test_outofbandmanagement_nestedplugin.py
test_oobm_issue_power_reset Error 2.26 test_outofbandmanagement_nestedplugin.py
test_oobm_issue_power_soft Error 3.28 test_outofbandmanagement_nestedplugin.py
test_oobm_issue_power_status Error 2.23 test_outofbandmanagement_nestedplugin.py
test_oobm_background_powerstate_sync Failure 21.58 test_outofbandmanagement.py
test_oobm_background_powerstate_sync Error 21.58 test_outofbandmanagement.py
test_oobm_configure_default_driver Error 0.05 test_outofbandmanagement.py
test_oobm_configure_invalid_driver Error 0.05 test_outofbandmanagement.py
test_oobm_disable_feature_invalid Error 0.04 test_outofbandmanagement.py
test_oobm_disable_feature_valid Error 1.14 test_outofbandmanagement.py
test_oobm_enable_feature_invalid Error 0.04 test_outofbandmanagement.py
test_oobm_enable_feature_valid Error 1.11 test_outofbandmanagement.py
test_oobm_enabledisable_across_clusterzones Error 11.82 test_outofbandmanagement.py
test_oobm_enabledisable_across_clusterzones Error 11.82 test_outofbandmanagement.py
test_oobm_issue_power_cycle Error 4.33 test_outofbandmanagement.py
test_oobm_issue_power_cycle Error 4.33 test_outofbandmanagement.py
test_oobm_issue_power_off Error 4.33 test_outofbandmanagement.py
test_oobm_issue_power_off Error 4.33 test_outofbandmanagement.py
test_oobm_issue_power_on Error 4.31 test_outofbandmanagement.py
test_oobm_issue_power_on Error 4.31 test_outofbandmanagement.py
test_oobm_issue_power_reset Error 4.32 test_outofbandmanagement.py
test_oobm_issue_power_reset Error 4.32 test_outofbandmanagement.py
test_oobm_issue_power_soft Error 4.36 test_outofbandmanagement.py
test_oobm_issue_power_soft Error 4.36 test_outofbandmanagement.py
test_oobm_issue_power_status Error 3.30 test_outofbandmanagement.py
test_oobm_issue_power_status Error 3.30 test_outofbandmanagement.py
test_oobm_multiple_mgmt_server_ownership Error 1.14 test_outofbandmanagement.py
test_oobm_multiple_mgmt_server_ownership Error 1.14 test_outofbandmanagement.py
test_oobm_zchange_password Error 1.17 test_outofbandmanagement.py
test_oobm_zchange_password Error 1.17 test_outofbandmanagement.py
test_02_edit_primary_storage_tags Error 0.01 test_primary_storage.py
test_01_primary_storage_scope_change Error 0.07 test_primary_storage_scope.py
test_01_vpc_privategw_acl Error 0.03 test_privategw_acl_ovs_gre.py
test_03_vpc_privategw_restart_vpc_cleanup Error 0.02 test_privategw_acl_ovs_gre.py
test_05_vpc_privategw_check_interface Error 0.02 test_privategw_acl_ovs_gre.py
test_01_vpc_privategw_acl Error 55.70 test_privategw_acl.py
test_02_vpc_privategw_static_routes Error 208.13 test_privategw_acl.py
test_03_vpc_privategw_restart_vpc_cleanup Error 202.28 test_privategw_acl.py
test_04_rvpc_privategw_static_routes Error 321.51 test_privategw_acl.py
test_01_purge_expunged_api_vm_start_date Error 47.88 test_purge_expunged_vms.py
test_02_purge_expunged_api_vm_end_date Error 44.28 test_purge_expunged_vms.py
test_03_purge_expunged_api_vm_start_end_date Error 43.05 test_purge_expunged_vms.py
test_04_purge_expunged_api_vm_no_date Error 41.03 test_purge_expunged_vms.py
test_05_purge_expunged_vm_service_offering Error 269.14 test_purge_expunged_vms.py
test_06_purge_expunged_vm_background_task Error 333.31 test_purge_expunged_vms.py
test_01_snapshot_root_disk Error 7.37 test_snapshots.py
test_CreateTemplateWithDuplicateName Error 19.75 test_templates.py
test_01_register_template_direct_download_flag Error 0.15 test_templates.py
test_01_positive_tests_usage Error 13.49 test_usage_events.py
test_01_ISO_usage Error 1.08 test_usage.py
test_01_lb_usage Error 5.26 test_usage.py
test_01_nat_usage Error 7.33 test_usage.py
test_01_public_ip_usage Error 1.08 test_usage.py
test_01_snapshot_usage Error 25.75 test_usage.py
test_01_template_usage Error 17.09 test_usage.py
test_01_vm_usage Error 136.44 test_usage.py
test_01_volume_usage Error 125.08 test_usage.py
test_01_vpn_usage Error 9.52 test_usage.py
test_12_start_vm_multiple_volumes_allocated Error 12.61 test_vm_life_cycle.py
test_01_vmschedule_create Error 0.10 test_vm_schedule.py
test_disable_oobm_ha_state_ineligible Error 0.05 test_hostha_kvm.py
test_hostha_configure_default_driver Error 0.04 test_hostha_kvm.py
test_hostha_enable_ha_when_host_disabled Error 0.04 test_hostha_kvm.py
test_hostha_enable_ha_when_host_disconected Error 0.04 test_hostha_kvm.py
test_hostha_enable_ha_when_host_in_maintenance Error 0.04 test_hostha_kvm.py
test_hostha_kvm_host_degraded Error 0.04 test_hostha_kvm.py
test_hostha_kvm_host_fencing Error 0.04 test_hostha_kvm.py
test_hostha_kvm_host_recovering Error 0.05 test_hostha_kvm.py
test_remove_ha_provider_not_possible Error 0.04 test_hostha_kvm.py

blueorangutan, Jul 13 '24 02:07

@blueorangutan test alma9 kvm-alma9

DaanHoogland, Jul 13 '24 19:07

@DaanHoogland a [SL] Trillian-Jenkins test job (alma9 mgmt + kvm-alma9) has been kicked to run smoke tests

blueorangutan, Jul 13 '24 19:07

[SF] Trillian test result (tid-10823)
Environment: kvm-alma9 (x2), Advanced Networking with Mgmt server a9
Total time taken: 51389 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8739-t10823-kvm-alma9.zip
Smoke tests completed. 136 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_06_purge_expunged_vm_background_task Failure 332.40 test_purge_expunged_vms.py

blueorangutan, Jul 14 '24 10:07