cx_Freeze
cx_Freeze copied to clipboard
Unhandled python exceptions do not result in non-zero exit codes in Win32Services
We have a python program cx-frozen into a Win32Service. The service must be running all the time and needs to recover from crashes automatically.
In face of unhandled python exceptions (ie fatal error), the running service seems to exit with a zero exit code leading the service state to STOPPED state. This render any of the Windows Service Recovery strategies ineffective as they only respond to non-zero ExitCodes.
As a workaround, we have resorted to a secondary service that's always checking the state of the frozen service. If it's in STOPPED state, it restarts it. That's an extra layer of management we need to maintain.
Can the service api change in a way that the exit codes reflects more closely the outcome of python program's execution?
I think so. You can put more details like a sample? Do your service is based on the sample in the repository? Do you use the initialize, stop, run and state_chaged in the Handler class?
Hi Marcel,
I spent some time today replicating this.
Using latest release ( 6.3 ) + the service sample i did the following.
- Replaced
Run
with:
def Run(self):
raise RuntimeError("Doomsday")
self.stopEvent.set()
- Built the service as usual + installed it via admin cmd prompt
- I set the recovery options of the service
- I start the service by right clicking > start
- The service immediately stops
- Checking the log there is the one error
[15008] 2021/02/09 10:29:45.078 starting logging at level ERROR
[15008] 2021/02/09 10:29:45.143 Python exception encountered:
[15008] 2021/02/09 10:29:45.143 Internal Message: exception running service
[15008] 2021/02/09 10:29:45.144 Type => <class 'RuntimeError'>
[15008] 2021/02/09 10:29:45.144 Value => Doomsday
[15008] 2021/02/09 10:29:45.144 Traceback (most recent call last):
[15008] 2021/02/09 10:29:45.144 File "C:\King\repos\git\WinServiceTesting\cx_sample\service\ServiceHandler.py", line 27, in Run
raise RuntimeError("Doomsday")
[15008] 2021/02/09 10:29:45.146 RuntimeError: Doomsday
[04308] 2021/02/09 10:29:45.147 ending logging
- Windows makes no attempt to restart the service
In the scenario of an unhandled exception being raised in any of the Services methods ( Install / Start / Stop / Run / state/session change ) i would expect the process to exit with a None-Zero exit code so the windows recovery options can react to the services death.
Having a quick look at the underlying implementation it looks like it will always return 0....
I hope that all makes sense / please say if i we can do anything to help out! And obligatory - thank you so much for your work on this awesome package!
it always returns 0
Yes, when the service run is ok. The second line (and following lines) of the log, are output of: https://github.com/marcelotduarte/cx_Freeze/blob/f5be487e5db58d2a482b272313e6809a06b7e3d0/source/bases/Win32Service.c#L546
And LogPythonException returns -1 https://github.com/anthony-tuininga/cx_Logging/blob/master/src/cx_Logging.c#L1529-L1546
to https://github.com/marcelotduarte/cx_Freeze/blob/f5be487e5db58d2a482b272313e6809a06b7e3d0/source/bases/Win32Service.c#L579-L583 Then you should use state_changed to catch this, and in Config.py it should be set to true (defaults to false) https://github.com/marcelotduarte/cx_Freeze/blob/f5be487e5db58d2a482b272313e6809a06b7e3d0/cx_Freeze/samples/service/Config.py#L45
Hi Marcel
I tried ( i think! ) your suggestions but am still finding it always exits cleanly and doesn't trigger any recovery actions.
(btw When searching this git project i can't find any references to state_changed
anywhere....)
I added a logger to the Handler class to try your suggestion, please find the modified service sample below :)
With this + SESSION_CHANGES
set to True
- i never see any logs from StateChanged
/ state_changed
or / SessionChanged
*( in this sample - in our main product SessionChanged works as expected for logging events like user logins etc )
import threading
import katlog
import os
def _get_logger():
_log_name = "cx_service_test"
logger_obj = katlog.get_template_kat_logger(
_log_name,
output_directory=katlog.get_kat_tool_logs_subdir(_log_name),
file_name=_log_name,
log_file_prefix="pid{}".format(os.getpid()),
)
logger_obj.info("LOGGER Created '{}'".format(_log_name))
return logger_obj
class Handler(object):
LOGGER = None
def __init__(self):
self.stopEvent = threading.Event()
self.stopRequestedEvent = threading.Event()
def Initialize(self, configFileName):
Handler.LOGGER = _get_logger()
Handler.LOGGER.debug("WinService.Initialize: Loading Virtual Config: {}".format(configFileName))
Handler.LOGGER.debug("WinService.Initialize: Creating Instance")
def Run(self):
Handler.LOGGER.debug("WinService.Run: About To Explode")
raise RuntimeError("Doomsday")
Handler.LOGGER.debug("WinService.Run: After Explosion")
self.stopEvent.set()
def Stop(self):
Handler.LOGGER.debug("WinService.Stop: About To Stop")
self.stopRequestedEvent.set()
self.stopEvent.wait()
def StateChanged(self, *args, **kwargs):
Handler.LOGGER.debug("WinService.StateChanged: State Changed")
Handler.LOGGER.debug("{}, {}".format(args, kwargs))
def state_changed(self, *args, **kwargs):
Handler.LOGGER.debug("WinService.state_changed: State Changed")
Handler.LOGGER.debug("{}, {}".format(args, kwargs))
def SessionChanged(self, *args, **kwargs):
Handler.LOGGER.debug("WinService.SessionChanged: {} {}".format(args, kwargs))
Log Output...
2021-02-10 10:51:50,149: INFO: katLogging: katLogging.py: MainThread: cx_service_test: get_template_kat_logger: 59: Created new Logger cx_service_test
2021-02-10 10:51:50,150: INFO: ServiceHandler: ServiceHandler.py: MainThread: cx_service_test: _get_logger: 22: LOGGER Created 'cx_service_test'
2021-02-10 10:51:50,150: INFO: __init__: __init__.py: MainThread: cx_service_test: enable_api_logger: 25: Connected katapi logger.
2021-02-10 10:51:50,150: DEBUG: ServiceHandler: ServiceHandler.py: MainThread: cx_service_test: Initialize: 40: WinService.Initialize: Loading Virtual Config: C:\King\repos\git\WinServiceTesting\cx_sample\service\build\exe.win32-3.5\cx_FreezeSampleService.ini
2021-02-10 10:51:50,150: DEBUG: ServiceHandler: ServiceHandler.py: MainThread: cx_service_test: Initialize: 41: WinService.Initialize: Creating Instance
2021-02-10 10:51:50,151: DEBUG: ServiceHandler: ServiceHandler.py: MainThread: cx_service_test: Run: 50: WinService.Run: About To Explode
Side note.... Re:
// run the service
if (Service_Run(&info) < 0) {
Service_SetStatus(&info, SERVICE_STOPPED);
return;
}
Wouldn't this return 0/None/Null?
From a brief reading, it seems that it's possible to set the exit code from a stopped service in the call to Service_SetStatus()
.
According to the documentation for windows services status, it's possible to use the param dwWin32ExitCode
to indicate what the service should report when stopping. Currently, this value is always set to 0.
If the function Service_SetStatus()
was given another parameter to indicate the intended exit code, this could likely be fixed. At the moment I'm not able to test, verify and offer a concrete PR for this change - but hopefully, this will make the process easier for the person who's looking into this issue.
@darad @TechnicalPirate Can you test the @johron PR?
To install the latest development build:
pip install --upgrade --pre --extra-index-url https://marcelotduarte.github.io/packages/ cx_Freeze
Hey Marcel - thanks for flagging it - I will put it on my TODO list to test.
Thank you for taking this Johron! :)
Hi @marcelotduarte - I spent some time with Johron this week :)
- We reproduce the issue again on the latest version to confirm nothing had changed in the interim
- We then restored his PR / brought it up to the latest master
- Found some issues with it -> which we've now addressed
- I can now confirm that the PR is fresh & addresses the issue - the expected behavior is now being performed.
@TechnicalPirate Did you test today's last commit?
@marcelotduarte - Yep, we actually ended up on a call to go through and ensure we were both observing the same behaviour prior to making the change -> then confirming service recovery works as expected.
There is one small nuance that's worth commenting on here - even with the latest commit - there is an implicit assumption that the service has to enter a running state for recovery measures to kick-in if the service process encounters an unhandled exception during execution. ( Something we discovered during testing ).
But I can confirm that as of the latest commit a9e0df3 on that PR - everything is tested and working 👍