Race condition when trying to delete inventory

Open TheRealHaoLiu opened this issue 10 months ago • 4 comments

Please confirm the following

  • [X] I agree to follow this project's code of conduct.
  • [X] I have checked the current issues for duplicates.
  • [X] I understand that AWX is open source software provided for free and that I might not receive a timely response.
  • [X] I am NOT reporting a (potential) security vulnerability. (These should be emailed to [email protected] instead.)

Bug Summary

TASK [inventory_source_update : Delete Inventory] ******************************
task path: /home/runner/.ansible/collections/ansible_collections/awx/awx/tests/output/.tmp/integration/inventory_source_update-brclamnz-ÅÑŚÌβŁÈ/tests/integration/targets/inventory_source_update/tasks/main.yml:125
Using module file /home/runner/.ansible/collections/ansible_collections/awx/awx/plugins/modules/inventory.py
Pipelining is enabled.
<testhost> ESTABLISH LOCAL CONNECTION FOR USER: runner
<testhost> EXEC /bin/sh -c '/usr/bin/python3 && sleep 0'
The full traceback is:
  File "/tmp/ansible_inventory_payload_kngr49df/ansible_inventory_payload.zip/ansible_collections/awx/awx/plugins/module_utils/controller_api.py", line 506, in make_request
    response = self.session.open(
  File "/tmp/ansible_inventory_payload_kngr49df/ansible_inventory_payload.zip/ansible/module_utils/urls.py", line 1578, in open
    r = urllib_request.urlopen(request, None, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
fatal: [testhost]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "controller_config_file": null,
            "controller_host": null,
            "controller_oauthtoken": null,
            "controller_password": null,
            "controller_username": null,
            "copy_from": null,
            "description": null,
            "host_filter": null,
            "input_inventories": null,
            "instance_groups": null,
            "kind": null,
            "name": "AWX-Collection-tests-inventory_source_update-inv-sQwIOMDPNPTxvaZM",
            "new_name": null,
            "organization": "Default",
            "prevent_instance_group_fallback": null,
            "request_timeout": null,
            "state": "absent",
            "validate_certs": null,
            "variables": null
        }
    },
    "msg": "You don't have permission to DELETE to /api/v2/inventories/8/ (HTTP 403)."
}

The view at https://github.com/ansible/awx/blob/devel/awx/api/views/inventory.py#L74 uses the RelatedJobsPreventDeleteMixin,

which prevents deletion while job event processing has not yet completed: https://github.com/ansible/awx/blob/devel/awx/api/views/mixin.py#L131

This causes flakes in our CI.
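
To make the failure mode concrete, here is a simplified, framework-free sketch of the behaviour described above. It is not the AWX implementation: `RelatedJob` and `check_can_delete` are hypothetical names, and the built-in `PermissionError` stands in for whatever the API raises to produce the HTTP 403.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RelatedJob:
    """Hypothetical stand-in for a job related to the inventory."""
    name: str
    event_processing_finished: bool


def check_can_delete(related_jobs: Iterable[RelatedJob]) -> None:
    """Raise PermissionError if any related job still has unprocessed events."""
    for job in related_jobs:
        if not job.event_processing_finished:
            # In an HTTP API this refusal would surface as a 403 response.
            raise PermissionError(
                f"Related job {job.name} is still processing events."
            )
```

A delete issued right after an inventory update finishes can hit this check before the events are fully processed, which would explain why the failure is intermittent.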

AWX version

awx: 24.1.1.dev24+gedd69123ed

Select the relevant components

  • [ ] UI
  • [ ] UI (tech preview)
  • [X] API
  • [ ] Docs
  • [ ] Collection
  • [ ] CLI
  • [ ] Other

Installation method

kubernetes

Modifications

no

Ansible version

No response

Operating system

No response

Web browser

No response

Steps to reproduce

Run our CI repeatedly; it sometimes fails: https://github.com/TheRealHaoLiu/awx/actions/runs/8542600522/job/23404593232

Expected results

CI does not flake

Actual results

CI flakes

Additional information

No response

TheRealHaoLiu, Apr 03 '24 18:04

From @AlanCoding: creating a job event with an invalid host is OK.

TheRealHaoLiu, Apr 03 '24 18:04

I would think the fix for this would be on the collection side NOT the api side. We just need to wait/retry.

Maybe I could see API changes to return more context (i.e. the running job ids).

Agreed? Other ideas?

chrismeyersfsu, Apr 03 '24 20:04
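
The wait/retry idea from the comment above could look roughly like the sketch below. It is only an illustration under assumptions: `DeleteConflict`, `delete_with_retry`, and all of its parameters are hypothetical names, not part of the AWX collection. Treating 403 as retryable mirrors the error in this report, but a 403 can also be a real permission problem, so the retry count is bounded.

```python
import time
from typing import Callable, Tuple


class DeleteConflict(Exception):
    """Hypothetical error: the API refused the delete with this HTTP status."""

    def __init__(self, status: int):
        super().__init__(f"delete refused with HTTP {status}")
        self.status = status


def delete_with_retry(
    delete: Callable[[], None],
    retries: int = 5,
    delay: float = 2.0,
    retry_statuses: Tuple[int, ...] = (403, 409),
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call delete() until it succeeds; return the number of attempts made."""
    for attempt in range(1, retries + 1):
        try:
            delete()
            return attempt
        except DeleteConflict as exc:
            # Give up on non-retryable statuses or once attempts are exhausted.
            if exc.status not in retry_statuses or attempt == retries:
                raise
            # Wait for job event processing to catch up before trying again.
            sleep(delay)
    raise ValueError("retries must be at least 1")
```

Passing the delete operation in as a callable keeps the sketch independent of any particular HTTP client; the same effect may be achievable in the integration test itself with Ansible's `retries`/`until` task keywords.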

Since this has gone so long without getting fixed, we could use this chance to update the wording here:

https://github.com/ansible/awx/blob/78fc23138a06d5db53a23e6b11b7f3e3e6f6efc5/awx_collection/plugins/module_utils/controller_api.py#L527

AlanCoding, Apr 26 '24 17:04

hopefully solved with the above PR

fosterseth, Apr 26 '24 21:04