SAS Viya 3.5 sas-viya-launcher-default service is not stopped by viya-services-stop.yml

Open LaimonasReklaitis opened this issue 2 years ago • 1 comments

On 2022-10-11 we installed SAS Viya 3.5 on our servers. On 2022-10-13 we downloaded and started using the viya-ark-master version (taken from https://github.com/sassoftware/viya-ark).

We noticed that the command ansible-playbook viya-ark/playbooks/viya-mmsu/viya-services-stop.yml shuts down all services except the sas-viya-launcher-default service:

fatal: [compute]: FAILED! => { "msg": [ "Please examine the following stray process(es)", "If enable_stray_cleanup=true, process will be cleaned up automatically", "except database processes which require fix manually to avoid data corruption.", "This playbook can be rerun to clean up the child processes.", [ "sas 3891757 1 0 Oct14 ? 00:02:33 /opt/sas/spre/home/SASFoundation/utilities/bin/launcher --ssl --ssl-certificate /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tls/certs/tklauncher/default/certificate.pem --ssl-private-key /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tls/certs/tklauncher/default/key.pem --ssl-private-key-password /opt/sas/viya/config/etc/SASSecurityCertificateFramework/private/tklauncher/default/key.password --ssl-ca-list /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem --hostname ls7a.ls.net --bind-address 0.0.0.0 --registration-port 41523 --sas-services-url https://ls5a.ls.net:443/ --log /opt/sas/viya/config/var/log/tklauncher/default/tklauncher_2022-10-14_08-47-57.log", "root 3922505 3891759 0 Oct14 ? 00:00:00 /opt/sas/spre/home/SASFoundation/utilities/bin/sasauth", "root 3922509 3891759 0 Oct14 ? 00:00:00 /opt/sas/spre/home/SASFoundation/utilities/bin/sasauth" ] ] }

If we use the command ansible-playbook viya-ark-master/playbooks/viya-mmsu/viya-services-stop.yml -e "enable_stray_cleanup=true", then the sas-viya-launcher-default service and its processes are cleaned up:

ok: [compute] => { "msg": [ "The following stray processes have been cleaned up", [ "sas 3891757 1 0 Oct14 ? 00:02:33 /opt/sas/spre/home/SASFoundation/utilities/bin/launcher --ssl --ssl-certificate /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tls/certs/tklauncher/default/certificate.pem --ssl-private-key /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tls/certs/tklauncher/default/key.pem --ssl-private-key-password /opt/sas/viya/config/etc/SASSecurityCertificateFramework/private/tklauncher/default/key.password --ssl-ca-list /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem --hostname ls7a.ls.net --bind-address 0.0.0.0 --registration-port 41523 --sas-services-url https://ls5a.ls.net:443/ --log /opt/sas/viya/config/var/log/tklauncher/default/tklauncher_2022-10-14_08-47-57.log", "root 3922505 3891759 0 Oct14 ? 00:00:00 /opt/sas/spre/home/SASFoundation/utilities/bin/sasauth", "root 3922509 3891759 0 Oct14 ? 00:00:00 /opt/sas/spre/home/SASFoundation/utilities/bin/sasauth" ] ] }
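The stray-process entries in both messages appear to follow the `ps -ef` column layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD). As a minimal sketch under that assumption, the following splits one such entry into its fields so the PID and parent PID can be inspected. `PsEntry` and `parse_ps_line` are hypothetical names used for illustration; they are not part of viya-ark.

```python
from typing import NamedTuple


class PsEntry(NamedTuple):
    user: str
    pid: int
    ppid: int
    command: str


def parse_ps_line(line: str) -> PsEntry:
    """Parse one line assumed to use the columns UID PID PPID C STIME TTY TIME CMD."""
    # Split on whitespace at most 7 times so the command keeps its arguments.
    fields = line.split(None, 7)
    if len(fields) < 8:
        raise ValueError(f"unexpected process line: {line!r}")
    user, pid, ppid, _cpu, _stime, _tty, _time, command = fields
    return PsEntry(user, int(pid), int(ppid), command)


# One of the sasauth entries from the output above.
entry = parse_ps_line(
    "root 3922505 3891759 0 Oct14 ? 00:00:00 "
    "/opt/sas/spre/home/SASFoundation/utilities/bin/sasauth"
)
print(entry.user, entry.pid, entry.ppid)
```

A parent PID of 1, as on the launcher entry, means the process has no other parent left to stop it.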

Environment

  • [ ] Ansible version: core 2.12.2
  • [ ] Python version: 3.8.12
  • [ ] OS version: RedHat 8.6 x86_64
  • [ ] Failed playbook tasks log (or entire playbook log) [Attach]
  • [ ] What version of Viya 3.x is being deployed? Viya 3.5

To Reproduce

Steps to reproduce the behavior:

  1. Download viya-ark-master from https://github.com/sassoftware/viya-ark
  2. Install SAS Viya 3.5
  3. Try to stop SAS Viya 3.5 services with command: ansible-playbook viya-ark/playbooks/viya-mmsu/viya-services-stop.yml

Expected behavior

We expect all SAS Viya 3.5 services to be stopped with the command: ansible-playbook viya-ark/playbooks/viya-mmsu/viya-services-stop.yml

deployment.log

LaimonasReklaitis • Oct 21 '22 05:10

At this time, that is the expected behavior of the stop playbook. It is a cautious approach, because forcefully stopping services that don't want to be stopped could be destructive, depending on the service. The enable_stray_cleanup=true parameter exists so that the administrator/user of the system can decide whether it's OK to proceed to kill such processes.
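The policy described here can be pictured with a small sketch. This is not the playbook's code; it only restates the behavior from the messages in this thread: without the flag, strays are reported; with it, they are cleaned up, except database processes, which are always left for manual handling. The function name, the `db_markers` parameter, and the idea of recognizing a database process by a substring such as "postgres" are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def plan_stray_handling(
    stray_commands: Sequence[str],
    enable_stray_cleanup: bool = False,
    db_markers: Sequence[str] = ("postgres",),  # hypothetical way to spot database processes
) -> Dict[str, List[str]]:
    """Decide which stray processes to report and which to clean up."""
    plan: Dict[str, List[str]] = {"report": [], "kill": []}
    for command in stray_commands:
        is_database = any(marker in command for marker in db_markers)
        if enable_stray_cleanup and not is_database:
            plan["kill"].append(command)
        else:
            # Default path: leave the decision to the administrator.
            plan["report"].append(command)
    return plan


strays = [
    "/opt/sas/spre/home/SASFoundation/utilities/bin/launcher --ssl",
    "postgres: checkpointer",  # made-up database entry for the example
]
print(plan_stray_handling(strays))
print(plan_stray_handling(strays, enable_stray_cleanup=True))
```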

erharb • Oct 24 '22 13:10

Hmm. There is something I don't understand. I ran an experiment:

  1. Stop sas-viya-launcher-default service on stateless server

See current status:

systemctl status sas-viya-launcher-default

sas-viya-launcher-default.service - LSB: start and stop sas-launcher service Loaded: loaded (/etc/rc.d/init.d/sas-viya-launcher-default; generated) Active: active (running) since Thu 2022-10-26 10:26:52 CEST; 4s ago

Stop service:

systemctl stop sas-viya-launcher-default

Check status once again to be sure it's dead:

systemctl status sas-viya-launcher-default

sas-viya-launcher-default.service - LSB: start and stop sas-launcher service Loaded: loaded (/etc/rc.d/init.d/sas-viya-launcher-default; generated) Active: inactive (dead)

  2. Stop SAS Viya 3.5 environment

With viya-ark-master playbook script:

ansible-playbook viya-ark-master/playbooks/viya-mmsu/viya-services-stop.yml

Result: fatal: [compute]: FAILED! => { "msg": [ "Please examine the following stray process(es)", "If enable_stray_cleanup=true, process will be cleaned up automatically", "except database processes which require fix manually to avoid data corruption.", "This playbook can be rerun to clean up the child processes.", [ "sas 2399029 1 0 10:06 ? 00:00:00 /opt/sas/spre/home/SASFoundation/utilities/bin/launcher --ssl --ssl-certificate /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tls/certs/tklauncher/default/certificate.pem --ssl-private-key /opt/sas/viya/config/etc/SASSecurityCertificateFramework/tls/certs/tklauncher/default/key.pem --ssl-private-key-password /opt/sas/viya/config/etc/SASSecurityCertificateFramework/private/tklauncher/default/key.password --ssl-ca-list /opt/sas/viya/config/etc/SASSecurityCertificateFramework/cacerts/trustedcerts.pem --hostname ls7a.ls.net --bind-address 0.0.0.0 --registration-port 34455 --sas-services-url https://ls5a.ls.net:443/ --log /opt/sas/viya/config/var/log/tklauncher/default/tklauncher_2022-10-26_10-06-52.log" ] ] }

On compute server:

cd /opt/sas/viya/config/var/log/tklauncher/default/

tail tklauncher_2022-10-26_10-06-52.log

2022-10-26T10:40:30,356 ERROR [00000006] App.tk.launcher (launcher.c:1358) - SAS Launcher: Exception occurred: code=8 (0x8)
2022-10-26T10:40:30,356 ERROR [00000006] App.tk.launcher (launcher.c:1362) - SAS Launcher: Exception description: Control break
2022-10-26T10:40:30,356 INFO [00000006] App.tk.launcher (launcher.c:1455) - SAS Launcher: Terminating...
2022-10-26T10:40:30,356 INFO [00000006] App.tk.launcher (launcher.c:1532) - SAS Launcher: Waiting up to 10 seconds before forcing termination...
2022-10-26T10:40:30,356 ERROR [00000008] App.tk.launcher (launcher.c:2204) - SAS Launcher: Terminating connection manager: status=0x803FD015
2022-10-26T10:40:30,364 ERROR [00000007] App.tk.launcher (launcher.c:1812) - SAS Launcher: Terminating resource manager: status=0x0
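For anyone who wants to filter a launcher log like the one above, here is a rough sketch that splits the text into entries at each timestamp and extracts the level, thread, logger, source location, and message. The entry layout is inferred only from the lines shown here, so treat the regular expressions as assumptions rather than a documented log format; `parse_launcher_log` is a hypothetical helper.

```python
import re
from typing import Dict, List

# Timestamp shape seen in the excerpt, e.g. 2022-10-26T10:40:30,356
_TS = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3}"
_ENTRY = re.compile(
    r"(?P<time>\S+)\s+(?P<level>\w+)\s+\[(?P<thread>\w+)\]\s+"
    r"(?P<logger>\S+)\s+\((?P<source>[^)]+)\)\s+-\s+(?P<message>.*)",
    re.S,
)


def parse_launcher_log(text: str) -> List[Dict[str, str]]:
    """Split log text at each timestamp and parse every entry that matches."""
    # Works whether entries are newline-separated or run together on one line.
    chunks = re.split(r"\s+(?=" + _TS + r"\s)", text.strip())
    entries = []
    for chunk in chunks:
        match = _ENTRY.fullmatch(chunk)
        if match:
            entries.append(match.groupdict())
    return entries


sample = (
    "2022-10-26T10:40:30,356 ERROR [00000006] App.tk.launcher (launcher.c:1362) - "
    "SAS Launcher: Exception description: Control break "
    "2022-10-26T10:40:30,356 INFO [00000006] App.tk.launcher (launcher.c:1455) - "
    "SAS Launcher: Terminating..."
)
for item in parse_launcher_log(sample):
    print(item["level"], item["source"], item["message"])
```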

The question is: why can't the viya-ark-master script clean up gracefully (without the -e "enable_stray_cleanup=true" parameter) when the sas-viya-launcher-default service is already dead?
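The situation in this experiment, where the unit reports inactive (dead) while a launcher process is still listed, can be checked with a short sketch like the one below. It only reads text that has already been captured (the `systemctl status` output and a process listing); the helper names are hypothetical, and the "Active:" pattern is based on the status output shown in this thread.

```python
import re
from typing import List, Optional, Sequence, Tuple


def unit_state(status_text: str) -> Optional[Tuple[str, str]]:
    """Return (state, sub_state) from an 'Active: state (sub_state)' fragment."""
    match = re.search(r"Active:\s+([\w-]+)\s+\(([^)]+)\)", status_text)
    return (match.group(1), match.group(2)) if match else None


def leftover_after_stop(
    status_text: str, ps_lines: Sequence[str], binary_path: str
) -> List[str]:
    """List processes still running the binary although the unit is inactive."""
    state = unit_state(status_text)
    if state is None or state[0] != "inactive":
        return []  # unit not stopped (or state unknown), so nothing counts as stray
    return [line for line in ps_lines if binary_path in line]


status = (
    "sas-viya-launcher-default.service - LSB: start and stop sas-launcher service "
    "Loaded: loaded (/etc/rc.d/init.d/sas-viya-launcher-default; generated) "
    "Active: inactive (dead)"
)
processes = [
    "sas 2399029 1 0 10:06 ? 00:00:00 "
    "/opt/sas/spre/home/SASFoundation/utilities/bin/launcher --ssl",
]
launcher = "/opt/sas/spre/home/SASFoundation/utilities/bin/launcher"
print(unit_state(status))
print(leftover_after_stop(status, processes, launcher))
```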

LaimonasReklaitis • Oct 27 '22 12:10

@LaimonasReklaitis Thank you for reporting the problem. We have identified the issues, and we'll work on the code to improve it. Thank you and have a nice day!

cuddlehub • Nov 09 '22 15:11

My apologies, it seems the code to fix this somehow got missed in internal merging activities. We'll be correcting this oversight in an upcoming release.

erharb • Jul 28 '23 15:07

This issue has been addressed in Release Viya35-ark-1.17

kevinlinglesas • Aug 02 '23 18:08