cloudstack
prevent an NPE on an uninitialised TemplateObject
Description
This PR fixes an NPE seen in an environment after an upgrade from 4.15 to 4.17. It is not clear whether configuration mistakes were made, but this PR attempts to handle the NPE more gracefully.
Starting a stopped VM after an ACS upgrade from 4.15.2 to 4.17.2 failed because of an NPE thrown while starting the VR.
2024-01-02 22:40:32,765 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-56:ctx-d6145db6 job-312930) (logid:77c2bb82) Unexpected exception while executing org.apache.cloudstack.api.command.admin.vm.StartVMCmdByAdmin
java.lang.NullPointerException
at org.apache.cloudstack.storage.image.store.TemplateObject.getId(TemplateObject.java:111)
at org.apache.cloudstack.storage.volume.VolumeServiceImpl.createVolumeFromTemplateAsync(VolumeServiceImpl.java:1533)
at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.recreateVolume(VolumeOrchestrator.java:1583)
at org.apache.cloudstack.engine.orchestration.VolumeOrchestrator.prepare(VolumeOrchestrator.java:1689)
at com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1179)
at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:972)
at com.cloud.network.router.NetworkHelperImpl.start(NetworkHelperImpl.java:315)
at com.cloud.network.router.NetworkHelperImpl.startVirtualRouter(NetworkHelperImpl.java:394)
at com.cloud.network.router.NetworkHelperImpl.startRouters(NetworkHelperImpl.java:379)
at org.apache.cloudstack.network.router.deployment.RouterDeploymentDefinition.deployVirtualRouter(RouterDeploymentDefinition.java:209)
at com.cloud.network.element.VirtualRouterElement.prepare(VirtualRouterElement.java:285)
at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.prepareElement(NetworkOrchestrator.java:1591)
at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.prepareNic(NetworkOrchestrator.java:1946)
at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.prepare(NetworkOrchestrator.java:1880)
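The top frame of the trace is `TemplateObject.getId`, which suggests the getter dereferenced a reference that was never set. The sketch below illustrates that failure mode and one possible guard. It is not the code from this PR: `TemplateRow`, `TemplateWrapper`, and both getter names are hypothetical stand-ins, and the actual change in `TemplateObject.java` may look different.

```java
public class TemplateGuardSketch {

    // Hypothetical stand-in for the database row behind a template.
    static final class TemplateRow {
        private final long id;
        TemplateRow(long id) { this.id = id; }
        long getId() { return id; }
    }

    // Hypothetical stand-in for a wrapper such as TemplateObject.
    static final class TemplateWrapper {
        private final TemplateRow row; // null when the wrapper was never initialised

        TemplateWrapper(TemplateRow row) { this.row = row; }

        // Unguarded getter: dereferences row and throws NullPointerException when it is null.
        long getIdUnguarded() { return row.getId(); }

        // Guarded getter: reports the uninitialised state explicitly.
        long getIdGuarded() {
            if (row == null) {
                throw new IllegalStateException("template wrapper has no backing row");
            }
            return row.getId();
        }
    }

    public static void main(String[] args) {
        TemplateWrapper broken = new TemplateWrapper(null);
        try {
            broken.getIdUnguarded();
        } catch (NullPointerException e) {
            System.out.println("unguarded getter threw NullPointerException");
        }
        try {
            broken.getIdGuarded();
        } catch (IllegalStateException e) {
            System.out.println("guarded getter threw: " + e.getMessage());
        }
    }
}
```

A guard like this does not repair the missing data. It only replaces a bare NPE deep in the call stack with a message that names the uninitialised object.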
Types of changes
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
- [ ] build/CI
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
- [ ] Major
- [x] Minor
Bug Severity
- [ ] BLOCKER
- [ ] Critical
- [x] Major
- [ ] Minor
- [ ] Trivial
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?
clearly something wrong in vmware; investigating
[SF] Trillian test result (tid-9945)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 64704 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8898-t9945-vmware-67u3.zip
Smoke tests completed. 62 look OK, 19 have errors, 0 did not run
Only failed and skipped tests results shown below:
oddly some of these test files are not in the smoke test directory
- test_deploy_vm.py
- test_escalations_templates.py
- test_host_annotations.py
- test_vm_ha
- test_vm_sync
and some should work
- [ ] test_affinity_groups_projects.py
- [ ] test_deploy_vm_root_resize.py
- [ ] test_global_settings.py
- [ ] test_host_maintenance.py
- [ ] test_network.py
- [ ] test_outofbandmanagement.py
- [ ] test_privategw_acl.py
- [ ] test_projects.py
- [ ] test_public_ip_range.py
- [ ] test_pvlan.py
- [ ] test_routers_network_ops.py
- [ ] test_templates.py
- [ ] test_vm_life_cycle.py (seen to fail on main lately)
- [ ] test_vpc_redundant.py
Suspicion: broken tests broke the env for the rest. Doing a manual round on these.
Codecov Report
Attention: Patch coverage is 0% with 8 lines in your changes missing coverage. Please review.
Project coverage is 12.23%. Comparing base (2339412) to head (3a0d1a4).
Files | Patch % | Lines |
---|---|---|
...cloudstack/storage/image/store/TemplateObject.java | 0.00% | 6 Missing :warning: |
...udstack/storage/image/TemplateDataFactoryImpl.java | 0.00% | 2 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## 4.18 #8898 +/- ##
============================================
- Coverage 12.23% 12.23% -0.01%
Complexity 9291 9291
============================================
Files 4698 4698
Lines 414257 414265 +8
Branches 52895 53365 +470
============================================
- Hits 50705 50703 -2
- Misses 357251 357261 +10
Partials 6301 6301
Flag | Coverage Δ | |
---|---|---|
unittests | 12.23% <0.00%> (-0.01%) | :arrow_down: |
Suggestion by @vishesh92:
diff --git a/engine/storage/image/src/main/java/org/apache/cloudstack/storage/image/TemplateDataFactoryImpl.java b/engine/storage/image/src/main/java/org/apache/cloudstack/storage/image/TemplateDataFactoryImpl.java
index 492ec74382..30c0131be8 100644
--- a/engine/storage/image/src/main/java/org/apache/cloudstack/storage/image/TemplateDataFactoryImpl.java
+++ b/engine/storage/image/src/main/java/org/apache/cloudstack/storage/image/TemplateDataFactoryImpl.java
@@ -97,6 +97,9 @@ public class TemplateDataFactoryImpl implements TemplateDataFactory {
@Override
public TemplateInfo getTemplate(long templateId, DataStore store) {
VMTemplateVO templ = imageDataDao.findById(templateId);
+ if (templ == null) {
+ return null;
+ }
if (store == null && !templ.isDirectDownload()) {
TemplateObject tmpl = TemplateObject.getTemplate(templ, null, null);
return tmpl;
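The early return in the suggested diff means `getTemplate` can now hand back `null` instead of failing on `templ.isDirectDownload()`, so callers need to check the result. A minimal sketch of that contract, assuming an in-memory map in place of the DAO; `TemplateRow`, `TemplateInfoSketch`, `ROWS`, and `describe` are hypothetical names for illustration, not CloudStack APIs:

```java
import java.util.HashMap;
import java.util.Map;

public class NullReturnFactorySketch {

    // Hypothetical stand-in for the persisted template record.
    static final class TemplateRow {
        final long id;
        final boolean directDownload;
        TemplateRow(long id, boolean directDownload) {
            this.id = id;
            this.directDownload = directDownload;
        }
    }

    // Hypothetical stand-in for the object the factory returns.
    static final class TemplateInfoSketch {
        final TemplateRow row;
        TemplateInfoSketch(TemplateRow row) { this.row = row; }
    }

    // In-memory replacement for the DAO lookup (assumption for this sketch).
    static final Map<Long, TemplateRow> ROWS = new HashMap<>();

    // Mirrors the shape of the suggested change: return null before touching a missing row.
    static TemplateInfoSketch getTemplate(long templateId) {
        TemplateRow row = ROWS.get(templateId);
        if (row == null) {
            return null;
        }
        return new TemplateInfoSketch(row);
    }

    // Caller side: the null result has to be handled explicitly.
    static String describe(long templateId) {
        TemplateInfoSketch info = getTemplate(templateId);
        if (info == null) {
            return "template " + templateId + " not found";
        }
        return "template " + info.row.id + " found";
    }

    public static void main(String[] args) {
        ROWS.put(1L, new TemplateRow(1L, false));
        System.out.println(describe(1L));
        System.out.println(describe(2L));
    }
}
```

Returning `null` moves the failure rather than removing it: any caller that does not check the result will hit its own NPE later, which is why the guard in the caller matters as much as the one in the factory.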
[SF] Trillian Build Failed (tid-10232)
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 9659
[SF] Trillian test result (tid-10251)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 48420 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8898-t10251-vmware-67u3.zip
Smoke tests completed. 110 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:
Test | Result | Time (s) | Test File |
---|---|---|---|
Finally it passes. I don't know what else to do for this corner case, so I'm leaving it at this.
@blueorangutan package
@harikrishna-patnala a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 9693
[SF] Trillian test result (tid-10363)
Environment: kvm-alma9 (x2), Advanced Networking with Mgmt server a9
Total time taken: 47282 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8898-t10363-kvm-alma9.zip
Smoke tests completed. 110 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:
Test | Result | Time (s) | Test File |
---|---|---|---|
@blueorangutan package
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 9814
@blueorangutan test alma9 vmware-67u3
@DaanHoogland a [SL] Trillian-Jenkins test job (alma9 mgmt + vmware-67u3) has been kicked to run smoke tests
[SF] Trillian test result (tid-10389)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server a9
Total time taken: 42698 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr8898-t10389-vmware-67u3.zip
Smoke tests completed. 110 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:
Test | Result | Time (s) | Test File |
---|---|---|---|