awx
Job failure as minikube logs rotate
Please confirm the following
- [X] I agree to follow this project's code of conduct.
- [X] I have checked the current issues for duplicates.
- [X] I understand that AWX is open source software provided for free and that I might not receive a timely response.
- [X] I am NOT reporting a (potential) security vulnerability. (These should be emailed to [email protected] instead.)
Bug Summary
On an AWX/minikube deployment, I have to use a workaround in my playbooks to avoid job failures caused by the minikube job pod's log rotating every 200 MB or so.
1/ The --log_file_max_size option passed at minikube start has no effect (minikube 1.32).
Observed with long loops (800 items) over a small include_tasks file that contains an incremental set_fact (I use include_tasks to avoid the OOM pod failure that happens when everything runs within a single playbook).
The workaround is to set no_log on my set_fact tasks to avoid log bloat.
I hope this is clear.
Best regards,
AWX version
24.2.0
Select the relevant components
- [ ] UI
- [ ] UI (tech preview)
- [ ] API
- [ ] Docs
- [ ] Collection
- [ ] CLI
- [X] Other
Installation method
minikube
Modifications
no
Ansible version
No response
Operating system
No response
Web browser
No response
Steps to reproduce
Long loops (800 items) over a small include_tasks file, with an incremental set_fact (using combine) in the included file:

Main playbook:

    - name: SNMP loop over the devices
      no_log: true
      ansible.builtin.include_tasks:
        file: gather_snmp_facts.yml
      loop: "{{ network_eqpts | dict2items }}"

Included playbook (gather_snmp_facts.yml):

    - name: Enrich the device dictionary (os+version)
      # set_fact loops generate a large amount of logs.
      # It is better to disable them for large playbooks,
      # otherwise the AWX job may crash (BBO 12/04/2024)
      no_log: true
      vars:
        ios: "{{ stdout_cisco.stdout | regex_search('Cisco IOS(?!.+IOSXE)') }}"
        iosxe: "{{ stdout_cisco.stdout | regex_search('IOSXE') }}"
        nxos: "{{ stdout_cisco.stdout | regex_search('NX-OS') }}"
        # For Nexans devices, the vendor is identified by
        # the 54VDC description (switches built into the power cable trunking)
        nexans: "{{ stdout_cisco.stdout | regex_search('54VDC') }}"
        checkpoint: "{{ stdout_checkpoint.stdout | default('') | regex_search('cpx86_64') }}"
        cisco_version: "{{ stdout_cisco.stdout | regex_search('(?i)(?<=version )([^ ,]+)') }}"
        checkpoint_version: "{{ stdout_checkpoint.stdout | default('') | regex_search('(?i)([^ ]+)(?=cpx86_64)') }}"
      ansible.builtin.set_fact:
        network_eqpts: "{{ network_eqpts | combine( { item.key : item.value | combine ({'os': {'Cisco IOS': 'IOS', 'IOSXE': 'IOS-XE', 'NX-OS': 'NX-OS', '54VDC': 'Nexans', 'cpx86_64': 'Checkpoint'}[ios+iosxe+nxos+nexans+checkpoint] | default(''), 'version': cisco_version+checkpoint_version}) } ) }}"
(The pod logs show that the set_fact output repeats the complete fact on every iteration, even though this does not appear in stdout.)
Expected results
The job runs to the end of the playbook.
Actual results
The job is interrupted (and marked as failed) when the pod logs rotate, and stdout is truncated.
Additional information
No response
Do you have the receptor reconnect feature enabled?
See the original PR, which also explains how to enable it: https://github.com/ansible/receptor/pull/683
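For readers who do not want to dig through the PR, here is a minimal sketch of how the setting is usually applied on an operator-managed AWX install. It assumes the instance is deployed through the AWX operator and that the operator's ee_extra_env field is available in your version; the instance name awx is hypothetical, and the value enabled is an assumption about what forces the feature on (as opposed to auto or disabled). Check the PR and the operator documentation for your release before relying on it.

```yaml
# Sketch only: fragment of an AWX custom resource (AWX operator assumed).
# "awx" is a hypothetical instance name; adapt it to your deployment.
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  # Extra environment variables injected into the execution environment
  # container, where receptor reads RECEPTOR_KUBE_SUPPORT_RECONNECT.
  ee_extra_env: |
    - name: RECEPTOR_KUBE_SUPPORT_RECONNECT
      value: enabled
```

With the variable set, receptor is expected to reconnect to the pod's log stream after a rotation instead of treating the interruption as a job failure.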
Hi,
I have not tried it, I must admit. I assumed the behaviour would be fine with RECEPTOR_KUBE_SUPPORT_RECONNECT left at its default of "auto", as this thread seems to suggest.
I will give it a try.
I cannot reproduce the problem right now. I am closing this issue for now and will see whether it comes back.
Best regards,