community.sap_install
sap_swpm: does not show the output of install script
Hello. The task that runs the install script ./sapinst should not be run as async and then monitored with other tasks. When the script fails, Ansible does not provide the stderr and stdout of the script, which makes troubleshooting impossible. There should be a single task that runs the script.
This task https://github.com/sap-linuxlab/community.sap_install/blob/main/roles/sap_swpm/tasks/swpm.yml#L64 should be:
- name: SAP SWPM - {{ sap_swpm_swpm_installation_header }}  # noqa no-changed-when
  ansible.builtin.shell: |
    {{ __sap_swpm_sapinst_command }}
  register: __sap_swpm_register_sapinst_async_job
  args:
    chdir: "{{ sap_swpm_sapinst_path }}"
  async: 1800  # Maximum allowed time in Seconds (30 minutes)
  poll: 30  # Polling Interval in Seconds
and remove the following tasks that monitor the script. There is no need to retrieve the RC and output separately, as the shell module already does this.
# Monitor sapinst process (i.e. ps aux | grep sapinst) and wait for exit
- name: SAP SWPM - Wait for sapinst process to exit, poll every 60 seconds
  community.general.pids:
    name: sapinst
  # shell: ps -ef | awk '/sapinst/&&!/awk/&&!/ansible/{print}'
  register: pids_sapinst
  until: "pids_sapinst.pids | length == 0"
  # until: "pids_sapinst.stdout | length == 0"
  retries: 1000
  delay: 60

- name: SAP SWPM - Verify if sapinst process finished successfully
  ansible.builtin.async_status:
    jid: "{{ __sap_swpm_register_sapinst_async_job.ansible_job_id }}"
  register: __sap_swpm_register_sapinst
  failed_when: __sap_swpm_register_sapinst.finished != 1 or __sap_swpm_register_sapinst.rc != 0
  # #until: __sap_swpm_register_sapinst.finished
  # #retries: 1000
  # #delay: 60

- name: SAP SWPM - Display the sapinst return code
  ansible.builtin.debug:
    msg: "{{ __sap_swpm_register_sapinst.rc }}"

- name: SAP SWPM - Display the sapinst output
  ansible.builtin.debug:
    msg: "{{ __sap_swpm_register_sapinst.stdout_lines }}"
  when: sap_swpm_display_unattended_output
@ZouhirYachou sapinst can take many hours to run, and the SSH session tunnel has a tendency to time out, so the Ansible Task never ends even when the sapinst process has ended. This is why the async approach, with a check that the process has ended, was used. This is explained in the commented code.
The default behaviour was altered upon request of other end-users, where the SWPM stdout/stderr upon error would wipe a terminal window if the scrollback buffer settings were too low (SWPM can easily output 10,000 lines to the terminal window).
Upon end-user request the following commit was created that introduced the variable set to not display output by default: https://github.com/sap-linuxlab/community.sap_install/commit/1861c15972abeeded7351ef41a9425823c39631e
If you use sap_swpm_display_unattended_output: true in your variables, you will see the output.
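For reference, a minimal sketch of where that variable could be set, assuming a playbook that applies the role directly. The host group name sap_hosts is a hypothetical placeholder and does not come from this thread:

```yaml
---
- name: Run SAP SWPM with the sapinst output displayed
  hosts: sap_hosts  # hypothetical inventory group
  vars:
    # Enables the "Display the sapinst output" debug task of the role
    sap_swpm_display_unattended_output: true
  roles:
    - role: community.sap_install.sap_swpm
```

The same variable could equally be placed in group_vars or passed with --extra-vars.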
Even with sap_swpm_display_unattended_output: true set, my playbook fails before the task that shows the output, so I have no access to the logs
TASK [community.sap_install.sap_swpm : Display the sapinst command line] *******
ok: [vlh1bse26] => {
"msg": "SAP SWPM install command: 'umask 022 ; ./sapinst SAPINST_INPUT_PARAMETERS_URL=/tmp/ansible.jrzwy2moswpmconfig/inifile.params SAPINST_EXECUTE_PRODUCT_ID=NW_ABAP_ASCS:S4HANA2022.CORE.HDB.ABAP SAPINST_SKIP_DIALOGS=true SAPINST_START_GUISERVER=false '"
}
TASK [community.sap_install.sap_swpm : SAP SWPM -] *****************************
changed: [vlh1bse26]
TASK [community.sap_install.sap_swpm : SAP SWPM - Wait for sapinst process to exit, poll every 60 seconds] ***
ok: [vlh1bse26]
TASK [community.sap_install.sap_swpm : SAP SWPM - Verify if sapinst process finished successfully] ***
fatal: [vlh1bse26]: FAILED! =>
{
"ansible_job_id": "j325284186928.5920",
"changed": false,
"failed_when_result": true,
"finished": 0,
"results_file": "/root/.ansible_async/j325284186928.5920",
"started": 1,
"stderr": "",
"stderr_lines": [],
"stdout": "",
"stdout_lines": []
}
When I run the command manually on the host, I do not get any errors and the script exits with return code 0.
My proposal allows for monitoring with an update on the status every 30 seconds (we can probably change the value for async to allow more than 30 minutes)
TASK [local_sap_swpm : Display the sapinst command line] ***********************
ok: [vlh1bse26] => {
"msg": "SAP SWPM install command: 'umask 022 ; ./sapinst SAPINST_INPUT_PARAMETERS_URL=/tmp/ansible.n18ypd4cswpmconfig/inifile.params SAPINST_EXECUTE_PRODUCT_ID=NW_ABAP_ASCS:S4HANA2022.CORE.HDB.ABAP SAPINST_SKIP_DIALOGS=true SAPINST_START_GUISERVER=false '"
}
TASK [local_sap_swpm : SAP SWPM -] *********************************************
ASYNC POLL on vlh1bse26: jid=j847759566754.5984 started=1 finished=0
ASYNC POLL on vlh1bse26: jid=j847759566754.5984 started=1 finished=0
ASYNC POLL on vlh1bse26: jid=j847759566754.5984 started=1 finished=0
ASYNC OK on vlh1bse26: jid=j847759566754.5984
changed: [vlh1bse26]
TASK [local_sap_swpm : SAP SWPM - Find last installation location] *************
ok: [vlh1bse26]
Hi @ZouhirYachou, async: 1800 is very optimistic. I have seen an S/4 install run for 3 hours in a cloud test environment with a slow database, so async: 32400 makes total sense. If we set poll to 30, we might get the same result as when we watch for the process to end. At least we get a less confusing shell output. I do not know the previous implementation exactly. Still, I would suggest encapsulating the current and the suggested method in code blocks, which would let us switch between the two with a variable. @ZouhirYachou's suggestion is at least a cleaner implementation, and it should become the default if it can be proven to be stable with the current Ansible release. What do you think, @berndfinger, @sean-freeman?
@ZouhirYachou something is not right in this output... under ansible_job_id there should be the executed cmd and the stdout/stderr entries.
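A rough sketch of what such a switch could look like, assuming a hypothetical boolean variable sap_swpm_sapinst_single_task that does not exist in the role. The task bodies reuse the names quoted earlier in this thread, and the values 32400 and 30 are taken from this comment rather than tested:

```yaml
# sap_swpm_sapinst_single_task is a hypothetical variable used only for this sketch
- name: SAP SWPM - Run sapinst as a single polled task (suggested method)
  when: sap_swpm_sapinst_single_task | default(false)
  block:
    - name: SAP SWPM - Run sapinst and poll every 30 seconds  # noqa no-changed-when
      ansible.builtin.shell: |
        {{ __sap_swpm_sapinst_command }}
      args:
        chdir: "{{ sap_swpm_sapinst_path }}"
      register: __sap_swpm_register_sapinst
      async: 32400  # Maximum allowed time in seconds (9 hours)
      poll: 30      # Polling interval in seconds

- name: SAP SWPM - Run sapinst detached and watch the process (current method)
  when: not (sap_swpm_sapinst_single_task | default(false))
  block:
    - name: SAP SWPM - Start sapinst detached  # noqa no-changed-when
      ansible.builtin.shell: |
        {{ __sap_swpm_sapinst_command }}
      args:
        chdir: "{{ sap_swpm_sapinst_path }}"
      register: __sap_swpm_register_sapinst_async_job
      async: 32400
      poll: 0       # Do not wait; later tasks watch the process

    - name: SAP SWPM - Wait for sapinst process to exit, poll every 60 seconds
      community.general.pids:
        name: sapinst
      register: pids_sapinst
      until: "pids_sapinst.pids | length == 0"
      retries: 1000
      delay: 60

    - name: SAP SWPM - Verify if sapinst process finished successfully
      ansible.builtin.async_status:
        jid: "{{ __sap_swpm_register_sapinst_async_job.ansible_job_id }}"
      register: __sap_swpm_register_sapinst
      failed_when: __sap_swpm_register_sapinst.finished != 1 or __sap_swpm_register_sapinst.rc != 0
```

Both branches register the final result as __sap_swpm_register_sapinst, so the existing debug tasks for the return code and output could stay unchanged after either block.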
Such as....
TASK [community.sap_install.sap_swpm : Display the sapinst command line] *********
ok: [nwas01] => {
"msg": "SAP SWPM install command: 'umask 022 ; ./sapinst SAPINST_INPUT_PARAMETERS_URL=/tmp/ansible.zm7n3b1gswpmconfig/inifile.params SAPINST_EXECUTE_PRODUCT_ID=NW_ABAP_OneHost:S4HANA2021.CORE.HDB.ABAP SAPINST_SKIP_DIALOGS=true SAPINST_START_GUISERVER=false '"
}
TASK [community.sap_install.sap_swpm : SAP SWPM -] ******************************
changed: [nwas01]
TASK [community.sap_install.sap_swpm : SAP SWPM - Wait for sapinst process to exit, poll every 60 seconds] **********
FAILED - RETRYING: [nwas01]: SAP SWPM - Wait for sapinst process to exit, poll every 60 seconds (1000 retries left).
ok: [nwas01]
TASK [community.sap_install.sap_swpm : SAP SWPM - Verify if sapinst process finished successfully] *********
fatal: [nwas01]: FAILED! =>
{
"ansible_job_id": "j444392358629.64741",
"changed": true,
"cmd": "umask 022 ; ./sapinst SAPINST_INPUT_PARAMETERS_URL=/tmp/ansible.zm7n3b1gswpmconfig/inifile.params SAPINST_EXECUTE_PRODUCT_ID=NW_ABAP_OneHost:S4HANA2021.CORE.HDB.ABAP SAPINST_SKIP_DIALOGS=true SAPINST_START_GUISERVER=false \n",
"failed_when_result": true,
"finished": 1,
"msg": "non-zero return code",
"rc": 111,
"results_file": "/root/.ansible_async/j444392358629.64741",
"start": "2023-06-30 18:41:40.436147",
"started": 1,
"stderr_lines": [
"=>sapparam(1c): No Profile used.",
"=>sapparam: SAPSYSTEMNAME neither in Profile nor in Commandline",
"################################################",
"Abort execution because of ",
"Step returns osmod.hosts.getHostByName",
"################################################"
],
"stdout_lines": [
"Extracting...",
"Extraction done!",
"SAPinst build information:"
....
....
"Removed directory /root/.sapinst/nwas01.example.com/64833."
]
}
@ZouhirYachou let's confirm a few things because I've not seen this behaviour before and the functionality of this Ansible Role has not changed (except for request to hide output, as shown in commit above + that has no impact on the debug you showed) in over 12 months.
- sap_swpm_sapinst_path is set to the directory path containing sapinst? e.g. if /software/sap_swpm_unpack/sapinst then the variable would be sap_swpm_sapinst_path: /software/sap_swpm_unpack.
- Ansible Core and Python version, see example...
$ ansible-playbook --version
ansible-playbook [core 2.16.2]
python version = 3.11.7 (main, Dec 4 2023, 18:10:11) [Clang 15.0.0 (clang-1500.1.0.2.5)] (/Users/username/.py_venv/py_ansible/bin/python3)
jinja version = 3.1.2
libyaml = True
- Ansible Collections versions
ansible-galaxy collection list
N.B. Poll is set to 60 seconds so that it is easier for the end-user to mentally calculate how long the installation has taken. If Ansible waits 59 seconds too long on a 5 minute install, it's a bit annoying, but on a 3 hour install it's unnoticeable.
Hello
The variable is set
sap_swpm_sapinst_path: /sapinst/swpm/sap_swpm_extracted/
Ansible version and python version: (we are using Ansible Automation platform 2.4 with Ansible EE 2.15)
bash-4.4# ansible --version
ansible [core 2.15.8]
config file = /etc/ansible/ansible.cfg
configured module search path = ['/home/runner/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3.9/site-packages/ansible
ansible collection location = /home/runner/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/bin/ansible
python version = 3.9.18 (main, Sep 22 2023, 17:58:34) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)] (/usr/bin/python3.9)
jinja version = 3.1.2
libyaml = True
bash-4.4# python --version
Python 3.9.18
and the requirements.yml for the collections
collections:
  - name: community.general
    version: 6.5.0
  - name: redhat.rhel_system_roles
    version: 1.22.0
  - name: community.sap_install
    version: 1.4.0
I do not understand why we use 3 tasks and a poll value of 0 when we could just use one task with a positive poll value, since we do not run other tasks concurrently. I can't explain the issue I'm having (empty output), but with my proposal I do not have any issues running the script.
@ZouhirYachou I explained this above. After a certain release of SAP SWPM 2.0 (SP10 I think), the Ansible Task that executed SAP SWPM would continue forever even though the sapinst process had exited successfully. It was almost impossible to diagnose, therefore a separation:
- an Ansible Task to execute sapinst to run detached (async with poll: 0)
- another Ansible Task to watch/poll the sapinst process every 60 seconds, and use failed_when if the watch/poll .finished was not 1 or the .rc was not 0
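As a sketch only, the verification step could also retry until the async job reports itself as finished, which is what the commented-out until/retries/delay lines in the task quoted above hint at. This assumes, without confirmation in this thread, that the finished: 0 result with no rc comes from the async status lagging behind the process exit. The retries and delay values are hypothetical:

```yaml
- name: SAP SWPM - Verify if sapinst process finished successfully
  ansible.builtin.async_status:
    jid: "{{ __sap_swpm_register_sapinst_async_job.ansible_job_id }}"
  register: __sap_swpm_register_sapinst
  # Retry while the async job has not yet been reported as finished
  until: __sap_swpm_register_sapinst.finished | default(0) | int == 1
  retries: 10  # hypothetical value
  delay: 30    # hypothetical value
  # default(1) keeps the expression valid when rc is absent from the result
  failed_when: >-
    __sap_swpm_register_sapinst.finished | default(0) | int != 1
    or __sap_swpm_register_sapinst.rc | default(1) | int != 0
```

The default() filters matter here because the failing output earlier in the thread contains no rc key at all.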
I'll run SAP SWPM today with false entries that trigger a failure, using the versions provided, to replicate your issue
@ZouhirYachou I have attempted:
- multiple versions of Ansible Core and Python, to assess the ansible.builtin.async_status Ansible Module
- with version lock to community.general version 6.5.0, to assess the community.general.pids Ansible Module
I cannot replicate your output (and subsequent failure) from my laptop. Therefore I have to conclude there is something about the specific setup, and I must run a test from Ansible Automation Platform with Ansible EE 2.15
Can you please describe the steps you used to upload and execute your Playbook from AAP ? I've never used it before and want to be sure the setup is identical to yours
I have synced the sap_install collection to our internal Automation Hub, and we then use it in AAP. We used the Red Hat documentation for the setup.
@ZouhirYachou which documentation specifically?
Like I said, I have never used AAP before and will need to set up everything identically to your environment.
This documentation to configure the Hub with AAP https://access.redhat.com/documentation/en-us/red_hat_ansible_automation_platform/2.4/html/getting_started_with_automation_hub/configure-hub-primary#proc-configure-automation-hub-server-gui
and this documentation to sync content from ansible galaxy https://access.redhat.com/documentation/en-us/red_hat_ansible_automation_platform/2.4/html/managing_content_in_automation_hub/managing-cert-valid-content#assembly-creating-tokens-in-automation-hub
@Sean: It should be easier and possible to pull the EE and run from ansible-navigator. @Zouhir: In AAP it is recommended to create an AAP with the 3 collections derived from your EE and not bind mount the collection into the container (although this is possible and should work)
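A sketch of an ansible-navigator settings file for the first suggestion, assuming ansible-navigator is installed locally with a container engine available. The image reference is a hypothetical placeholder for the EE image used in AAP:

```yaml
# ansible-navigator.yml, placed in the directory the playbook is run from
---
ansible-navigator:
  execution-environment:
    enabled: true
    # Hypothetical image reference; substitute the EE image used in AAP
    image: registry.example.com/ee-sap:latest
  # Print plain playbook output instead of opening the text UI
  mode: stdout
```

With that file in place, ansible-navigator run followed by the playbook path would execute the playbook inside the EE rather than against the local Ansible installation.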