octoprint_deploy
Instance fails randomly
Hi,
I'm currently having a problem with one of my instances. The instance randomly fails and I need to restart the service to get it going again.
Here's what systemctl status X5SA gives me:
X5SA.service - The snappy web interface for your 3D printer
     Loaded: loaded (/etc/systemd/system/X5SA.service; enabled; vendor preset: enabled)
     Active: failed (Result: signal) since Tue 2022-08-09 20:24:36 CEST; 12h ago
    Process: 822405 ExecStart=/home/thomas/OctoPrint/bin/octoprint serve --config=${CONFIGFILE} --basedir=${BASEDIR} --port=${PORT} (code=killed, signal=SEGV)
   Main PID: 822405 (code=killed, signal=SEGV)
        CPU: 12min 4.852s
Aug 09 19:28:47 octoprint-host octoprint[822405]: 2022-08-09 19:28:47,831 - octoprint.plugins.tracking - INFO - Sent tracking event ping, payload: {'octoprint_uptime': 125105, 'printer_state': 'OFFLINE'}
Aug 09 19:43:42 octoprint-host octoprint[822405]: 2022-08-09 19:43:42,891 - octoprint.server.heartbeat - INFO - Server heartbeat <3
Aug 09 19:43:47 octoprint-host octoprint[822405]: 2022-08-09 19:43:47,833 - octoprint.plugins.tracking - INFO - Sent tracking event ping, payload: {'octoprint_uptime': 126005, 'printer_state': 'OFFLINE'}
Aug 09 19:58:42 octoprint-host octoprint[822405]: 2022-08-09 19:58:42,892 - octoprint.server.heartbeat - INFO - Server heartbeat <3
Aug 09 19:58:47 octoprint-host octoprint[822405]: 2022-08-09 19:58:47,835 - octoprint.plugins.tracking - INFO - Sent tracking event ping, payload: {'octoprint_uptime': 126905, 'printer_state': 'OFFLINE'}
Aug 09 20:13:42 octoprint-host octoprint[822405]: 2022-08-09 20:13:42,893 - octoprint.server.heartbeat - INFO - Server heartbeat <3
Aug 09 20:13:47 octoprint-host octoprint[822405]: 2022-08-09 20:13:47,844 - octoprint.plugins.tracking - INFO - Sent tracking event ping, payload: {'octoprint_uptime': 127805, 'printer_state': 'OFFLINE'}
Aug 09 20:24:36 octoprint-host systemd[1]: X5SA.service: Main process exited, code=killed, status=11/SEGV
Aug 09 20:24:36 octoprint-host systemd[1]: X5SA.service: Failed with result 'signal'.
Aug 09 20:24:36 octoprint-host systemd[1]: X5SA.service: Consumed 12min 4.852s CPU time.
Does anyone have an idea where I should look to get to the bottom of this? The other instance runs without a problem.
Here's the content of the X5SA.service file:
[Unit]
Description=The snappy web interface for your 3D printer
After=network.online.target
Wants=network.online.target
[Service]
Environment="PORT=5002"
Environment="BASEDIR=/home/thomas//.X5SA"
Environment="CONFIGFILE=/home/thomas//.X5SA/config.yaml"
User=thomas
ExecStart=/home/thomas/OctoPrint/bin/octoprint serve --config=${CONFIGFILE} --basedir=${BASEDIR} --port=${PORT}
[Install]
WantedBy=multi-user.target
I compared it to the X5SAPro.service file and it's basically the same (except for the different variable values, of course).
Okay, I just looked at the dmesg output and found this:
[Tue Aug 9 20:24:35 2022] octoprint[822405]: segfault at 7f6a905d23d0 ip 000000000051d2bc sp 00007fff6263e550 error 6 in python3.9[41f000+288000]
[Tue Aug 9 20:24:35 2022] Code: 4c 8b 6f 40 83 c2 01 48 8d 9f 68 01 00 00 41 89 94 24 b8 00 00 00 4c 39 eb 73 22 48 8b 3b 48 85 ff 74 11 48 c7 03 00 00 00 00 <48> 83 2f 01 0f 84 ba 00 00 00 48 83 c3 08 4c 39 eb 72 de 48 83 7d
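For anyone trying to read the kernel line above, here is a rough Python sketch that splits it into its fields. It was not posted in the original thread: the function name `parse_segfault_line` and the regular expression are illustrative assumptions. The bit meanings follow the x86 page-fault error code (bit 0: page was present, bit 1: write access, bit 2: user mode), and the kernel prints that code in hexadecimal.

```python
import re

# Illustrative pattern for the x86 kernel "segfault at ..." message.
# All numeric fields, including the error code, are hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault_line(line):
    """Return the decoded fields of a segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "object": m.group("obj"),
        "fault_address": int(m.group("addr"), 16),
        # Offset of the faulting instruction inside the mapped region.
        "offset_in_mapping": ip - base,
        "page_present": bool(err & 1),   # bit 0: 0 means the page was not mapped
        "write_access": bool(err & 2),   # bit 1: 1 means the access was a write
        "user_mode": bool(err & 4),      # bit 2: 1 means the fault came from user space
    }
```

Applied to the line in this thread, error 6 decodes to a user-mode write to a page that was not mapped, with the faulting instruction at offset 0xfe2bc inside the python3.9 mapping. That alone does not tell buggy native code apart from faulty RAM.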
Not something I have seen before, even when running many instances. It is possible there is a memory issue, which can give rise to segfaults. You could try running top and seeing what is happening with memory usage as the two instances run.
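As an alternative to watching top by hand, a small script could log each instance's resident memory over time. This is only a sketch and was not part of the original exchange: the helper names `rss_kib` and `service_rss_kib` are made up for this example, and it assumes a Linux host where /proc/<pid>/status is readable.

```python
from pathlib import Path


def rss_kib(status_text):
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    123456 kB".
            return int(line.split()[1])
    return None


def service_rss_kib(pid):
    """Read the resident set size of a running process by PID (Linux only)."""
    return rss_kib(Path(f"/proc/{pid}/status").read_text())
```

The PID of an instance can be looked up with `systemctl show --property MainPID --value X5SA` and passed to `service_rss_kib`. Steady values for both instances right up to a crash would point away from the processes simply running out of memory.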
I'll have a look at the memory usage, but there should be enough memory for those two instances.
Maybe a memory stick is bad, since this is an old laptop I'm currently using.
Just a couple of minutes ago the other instance crashed (same error in dmesg output).
Yeah, some bad memory might be what you are looking at here.