Race condition when trying to delete inventory
Please confirm the following
- [X] I agree to follow this project's code of conduct.
- [X] I have checked the current issues for duplicates.
- [X] I understand that AWX is open source software provided for free and that I might not receive a timely response.
- [X] I am NOT reporting a (potential) security vulnerability. (These should be emailed to [email protected] instead.)
Bug Summary
TASK [inventory_source_update : Delete Inventory] ******************************
task path: /home/runner/.ansible/collections/ansible_collections/awx/awx/tests/output/.tmp/integration/inventory_source_update-brclamnz-ÅÑŚÌβŁÈ/tests/integration/targets/inventory_source_update/tasks/main.yml:125
Using module file /home/runner/.ansible/collections/ansible_collections/awx/awx/plugins/modules/inventory.py
Pipelining is enabled.
<testhost> ESTABLISH LOCAL CONNECTION FOR USER: runner
<testhost> EXEC /bin/sh -c '/usr/bin/python3 && sleep 0'
The full traceback is:
File "/tmp/ansible_inventory_payload_kngr49df/ansible_inventory_payload.zip/ansible_collections/awx/awx/plugins/module_utils/controller_api.py", line 506, in make_request
response = self.session.open(
File "/tmp/ansible_inventory_payload_kngr49df/ansible_inventory_payload.zip/ansible/module_utils/urls.py", line 1578, in open
r = urllib_request.urlopen(request, None, timeout)
File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.10/urllib/request.py", line 525, in open
response = meth(req, response)
File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
response = self.parent.error(
File "/usr/lib/python3.10/urllib/request.py", line 563, in error
return self._call_chain(*args)
File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
result = func(*args)
File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
fatal: [testhost]: FAILED! => {
"changed": false,
"invocation": {
"module_args": {
"controller_config_file": null,
"controller_host": null,
"controller_oauthtoken": null,
"controller_password": null,
"controller_username": null,
"copy_from": null,
"description": null,
"host_filter": null,
"input_inventories": null,
"instance_groups": null,
"kind": null,
"name": "AWX-Collection-tests-inventory_source_update-inv-sQwIOMDPNPTxvaZM",
"new_name": null,
"organization": "Default",
"prevent_instance_group_fallback": null,
"request_timeout": null,
"state": "absent",
"validate_certs": null,
"variables": null
}
},
"msg": "You don't have permission to DELETE to /api/v2/inventories/8/ (HTTP 403)."
}
The inventory view at https://github.com/ansible/awx/blob/devel/awx/api/views/inventory.py#L74 uses the RelatedJobsPreventDeleteMixin, which prevents deletion while job event processing has not completed: https://github.com/ansible/awx/blob/devel/awx/api/views/mixin.py#L131
A DELETE issued right after a related job finishes can therefore be rejected with HTTP 403, as in the log above. This causes flakes in our CI.
AWX version
awx: 24.1.1.dev24+gedd69123ed
Select the relevant components
- [ ] UI
- [ ] UI (tech preview)
- [X] API
- [ ] Docs
- [ ] Collection
- [ ] CLI
- [ ] Other
Installation method
kubernetes
Modifications
no
Ansible version
No response
Operating system
No response
Web browser
No response
Steps to reproduce
Run our CI repeatedly; it sometimes fails: https://github.com/TheRealHaoLiu/awx/actions/runs/8542600522/job/23404593232
Expected results
CI doesn't flake
Actual results
CI flakes
Additional information
No response
From @AlanCoding: creating a job event with an invalid host is OK.
I would think the fix for this would be on the collection side NOT the api side. We just need to wait/retry.
Maybe I could see API changes to return more context (i.e. the running job ids).
Agreed? Other ideas?
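To make the wait/retry idea concrete, here is a minimal sketch of what a retry loop around the delete call could look like. It is not the collection's actual code: `delete_with_retry`, `delete_fn`, `retry_statuses`, and the default retry count and delay are all hypothetical names and values chosen for illustration. The only detail taken from this issue is that the transient failure surfaces as HTTP 403.

```python
import time
from typing import Callable, Iterable


def delete_with_retry(
    delete_fn: Callable[[], int],
    retries: int = 5,
    delay: float = 2.0,
    retry_statuses: Iterable[int] = (403,),
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call delete_fn, retrying while it returns a status in retry_statuses.

    delete_fn is a hypothetical callable that issues the DELETE request and
    returns the HTTP status code. At most `retries` extra attempts are made
    after the first one, sleeping `delay` seconds before each retry.
    """
    retryable = set(retry_statuses)
    status = delete_fn()
    attempts_left = retries
    while status in retryable and attempts_left > 0:
        # Give the server time to finish processing job events.
        sleep(delay)
        status = delete_fn()
        attempts_left -= 1
    # Either a non-retryable status or the last retryable one after giving up.
    return status
```

One caveat with this approach: a 403 can also mean a genuine permission problem, so the retry count should stay small and the final status should still be reported to the user rather than swallowed.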
Since this has gone so long without getting fixed, we could use this chance to update the wording here:
https://github.com/ansible/awx/blob/78fc23138a06d5db53a23e6b11b7f3e3e6f6efc5/awx_collection/plugins/module_utils/controller_api.py#L527
Hopefully solved with the above PR.