cloudstack
server: throw exception if fail to cleanup IP resources when release a public IP
Description
While reproducing issue #8967, I got the following errors:
2024-05-03T14:24:16,885 WARN [c.c.n.IpAddressManagerImpl] (API-Job-Executor-44:[ctx-6da31d57, job-15369, ctx-eb716451]) (logid:d43b402b) Unable to revoke all the firewall rules for ip id=2 as a part of ip release
2024-05-03T14:24:29,282 DEBUG [c.c.n.IpAddressManagerImpl] (API-Job-Executor-44:[ctx-6da31d57, job-15369, ctx-eb716451]) (logid:d43b402b) Releasing ip id=2; sourceNat = false
2024-05-03T14:24:29,271 WARN [c.c.n.IpAddressManagerImpl] (API-Job-Executor-44:[ctx-6da31d57, job-15369, ctx-eb716451]) (logid:d43b402b) Failed to release resources for ip address id=2
2024-05-03T14:24:36,266 WARN [c.c.n.NetworkServiceImpl] (API-Job-Executor-44:[ctx-6da31d57, job-15369, ctx-eb716451]) (logid:d43b402b) Failed to release public ip address id=2
The errors are ignored: the public IP is released successfully in CloudStack, but it is still associated with a VR. Associating the IP with another network then causes an issue similar to #8967. However, the reporter of #8967 could not find any error like "Failed to release" or "Unable to revoke" in their logs, so the root cause of #8967 could be different.
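The behaviour change can be illustrated with a minimal, self-contained sketch. It is not the code of this patch: `ReleaseIpSketch`, `CleanupStep`, `IpCleanupException` and both method names are hypothetical stand-ins, and the log strings only imitate the messages quoted above. The first method logs a warning and still reports the release as done, the second stops with an exception at the first failed cleanup step.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.LongPredicate;

public class ReleaseIpSketch {

    /** Hypothetical unchecked exception used to abort the release. */
    static class IpCleanupException extends RuntimeException {
        IpCleanupException(String message) {
            super(message);
        }
    }

    /** Hypothetical cleanup step; the action returns false when cleanup fails. */
    static final class CleanupStep {
        final String name;
        final LongPredicate action;

        CleanupStep(String name, LongPredicate action) {
            this.name = name;
            this.action = action;
        }
    }

    /** Failures are only logged, so the caller still sees a successful release. */
    static boolean releaseIgnoringFailures(long ipId, List<CleanupStep> steps, List<String> log) {
        for (CleanupStep step : steps) {
            if (!step.action.test(ipId)) {
                log.add("WARN Unable to " + step.name + " for ip id=" + ipId);
            }
        }
        log.add("DEBUG Releasing ip id=" + ipId);
        return true;
    }

    /** The first failed step aborts the release, so the IP is never marked as released. */
    static boolean releaseOrThrow(long ipId, List<CleanupStep> steps, List<String> log) {
        for (CleanupStep step : steps) {
            if (!step.action.test(ipId)) {
                throw new IpCleanupException("Failed to " + step.name + " for ip id=" + ipId);
            }
        }
        log.add("DEBUG Releasing ip id=" + ipId);
        return true;
    }

    public static void main(String[] args) {
        // A single step that always fails, imitating a router that rejects the change.
        List<CleanupStep> steps = Collections.singletonList(
                new CleanupStep("revoke all the firewall rules", ipId -> false));

        List<String> log = new ArrayList<>();
        System.out.println("ignoring failures: released=" + releaseIgnoringFailures(2L, steps, log));
        log.forEach(System.out::println);

        try {
            releaseOrThrow(2L, steps, new ArrayList<>());
        } catch (IpCleanupException e) {
            System.out.println("throwing variant: " + e.getMessage());
        }
    }
}
```

With the throwing variant the caller can fail the API job instead of reporting a release that left state behind on the router.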
Types of changes
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Enhancement (improves an existing feature and functionality)
- [ ] Cleanup (Code refactoring and cleanup, that may add test cases)
- [ ] build/CI
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
- [ ] Major
- [ ] Minor
Bug Severity
- [ ] BLOCKER
- [ ] Critical
- [ ] Major
- [ ] Minor
- [ ] Trivial
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?
@blueorangutan package
@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Codecov Report
Attention: Patch coverage is 0% with 6 lines in your changes missing coverage. Please review.
Project coverage is 14.95%. Comparing base (ea9a0f4) to head (8d0aab5). Report is 199 commits behind head on 4.19.
| Files | Patch % | Lines |
|---|---|---|
| ...n/java/com/cloud/network/IpAddressManagerImpl.java | 0.00% | 5 Missing :warning: |
| ...n/java/com/cloud/network/dao/IPAddressDaoImpl.java | 0.00% | 1 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## 4.19 #9059 +/- ##
============================================
- Coverage 14.96% 14.95% -0.01%
- Complexity 10995 11017 +22
============================================
Files 5373 5382 +9
Lines 469024 470133 +1109
Branches 58818 59924 +1106
============================================
+ Hits 70197 70320 +123
- Misses 391056 392024 +968
- Partials 7771 7789 +18
| Flag | Coverage Δ | |
|---|---|---|
| uitests | 4.28% <ø> (-0.04%) | :arrow_down: |
| unittests | 15.66% <0.00%> (-0.01%) | :arrow_down: |
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 9576
@blueorangutan test matrix
@weizhouapache a [SL] Trillian-Jenkins matrix job (centos7 mgmt + xenserver71, rocky8 mgmt + vmware67u3, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests
[SF] Trillian test result (tid-10194)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 43853 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9059-t10194-kvm-centos7.zip
Smoke tests completed. 130 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:
| Test | Result | Time (s) | Test File |
|---|---|---|---|
| test_01_events_resource | Error | 420.30 | test_events_resource.py |
[SF] Trillian test result (tid-10192)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 47085 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9059-t10192-xenserver-71.zip
Smoke tests completed. 130 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:
| Test | Result | Time (s) | Test File |
|---|---|---|---|
| test_01_events_resource | Error | 336.85 | test_events_resource.py |
[SF] Trillian test result (tid-10193)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server r8
Total time taken: 50852 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9059-t10193-vmware-67u3.zip
Smoke tests completed. 128 look OK, 3 have errors, 0 did not run
Only failed and skipped tests results shown below:
| Test | Result | Time (s) | Test File |
|---|---|---|---|
| test_01_events_resource | Error | 351.57 | test_events_resource.py |
| test_create_pvlan_network | Error | 0.09 | test_pvlan.py |
| test_02_trigger_shutdown | Failure | 341.73 | test_safe_shutdown.py |
@blueorangutan package
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
@blueorangutan package
@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10083
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10085
@blueorangutan test
@sureshanaparti a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests
[SF] Trillian test result (tid-10634)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 41982 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9059-t10634-kvm-centos7.zip
Smoke tests completed. 131 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:
| Test | Result | Time (s) | Test File |
|---|---|---|---|
@blueorangutan package
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 10195
This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.
@weizhouapache
I am not able to reproduce the behaviour. Do you have any other specific steps?
I followed these steps:
Create 2 sessions and execute the following APIs at the same time:
- Execute disassociate IP address
- Execute create firewall rule on the same IP
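For anyone repeating the attempt, the "at the same time" part of these steps can be sketched in Java as below. This is only an illustration: `ConcurrentCallsSketch` and `runTogether` are hypothetical names, and the two lambdas in `main` are placeholders for whatever client code issues the disassociate IP address and create firewall rule calls. A shared latch releases both tasks together so neither gets a head start.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentCallsSketch {

    /** Runs both calls on separate threads, released together by a start gate. */
    static List<String> runTogether(Callable<String> first, Callable<String> second) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch startGate = new CountDownLatch(1);
        try {
            Future<String> a = pool.submit(() -> {
                startGate.await();
                return first.call();
            });
            Future<String> b = pool.submit(() -> {
                startGate.await();
                return second.call();
            });
            startGate.countDown(); // both tasks proceed from here
            return Arrays.asList(a.get(), b.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the calls", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("One of the calls failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholders (assumptions): replace with real API client calls.
        List<String> results = runTogether(
                () -> "disassociate IP address: placeholder result",
                () -> "create firewall rule: placeholder result");
        results.forEach(System.out::println);
    }
}
```

The latch only narrows the window between the two calls. It does not guarantee that the server processes them in any particular order, so several runs may be needed to hit a race.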
I did not observe the log messages:
egrep 'Failed to release | Unable to revoke ' /var/log/cloudstack/management/management-server.log
The create firewall API returns:
Error: (HTTP 431, error code 4350) Unable to create firewall rule for the IP address ID=5 as IP is not associated with any network and no networkId is passed in
Also, I see the IP address present in the router even after the disassociate IP address call is successful:
root@r-4-VM:~# cat /etc/cloudstack/ips.json
{
  "eth0": [
    {
      "add": true,
      "broadcast": "10.1.1.255",
      "cidr": "10.1.1.1/24",
      "device": "eth0",
      "gateway": "",
      "netmask": "255.255.255.0",
      "network": "10.1.1.0/24",
      "nic_dev_id": "0",
      "nw_type": "guest",
      "one_to_one_nat": false,
      "public_ip": "10.1.1.1",
      "size": "24",
      "source_nat": false
    }
  ],
  "eth1": [
    {
      "add": true,
      "broadcast": "169.254.255.255",
      "cidr": "169.254.216.24/16",
      "device": "eth1",
      "gateway": "",
      "netmask": "255.255.0.0",
      "network": "169.254.0.0/16",
      "nic_dev_id": "1",
      "nw_type": "control",
      "one_to_one_nat": false,
      "public_ip": "169.254.216.24",
      "size": "16",
      "source_nat": false
    }
  ],
  "eth2": [
    {
      "add": true,
      "broadcast": "10.0.63.255",
      "cidr": "10.0.54.123/20",
      "device": "eth2",
      "first_i_p": true,
      "gateway": "10.0.48.1",
      "is_private_gateway": false,
      "mtu": "1500",
      "netmask": "255.255.240.0",
      "network": "10.0.48.0/20",
      "new_nic": false,
      "nic_dev_id": 2,
      "nw_type": "public",
      "one_to_one_nat": false,
      "public_ip": "10.0.54.123",
      "size": "20",
      "source_nat": true,
      "vif_mac_address": "1e:00:89:00:00:03"
    },
    {
      "add": false,
      "broadcast": "10.0.63.255",
      "cidr": "10.0.54.124/20",
      "device": "eth2",
      "first_i_p": false,
      "gateway": "10.0.48.1",
      "is_private_gateway": false,
      "mtu": "1500",
      "netmask": "255.255.240.0",
      "network": "10.0.48.0/20",
      "new_nic": false,
      "nic_dev_id": 2,
      "nw_type": "public",
      "one_to_one_nat": false,
      "public_ip": "10.0.54.124",
      "size": "20",
      "source_nat": false,
      "vif_mac_address": "1e:00:89:00:00:03"
    }
  ],
  "id": "ips"
}
@kiranchavala Thanks. This might have been fixed by #9234.
Closing. Reopen if needed.