octoprint_deploy

Instance fails randomly

Open taker218 opened this issue 2 years ago • 4 comments

Hi,

I'm currently having a problem with one of my instances. The instance randomly fails and I need to restart the service to get it going again.

Here's the output of systemctl status X5SA:

X5SA.service - The snappy web interface for your 3D printer
     Loaded: loaded (/etc/systemd/system/X5SA.service; enabled; vendor preset: enabled)
     Active: failed (Result: signal) since Tue 2022-08-09 20:24:36 CEST; 12h ago
    Process: 822405 ExecStart=/home/thomas/OctoPrint/bin/octoprint serve --config=${CONFIGFILE} --basedir=${BASEDIR} --port=${PORT} (code=killed, signal=SEGV)
   Main PID: 822405 (code=killed, signal=SEGV)
        CPU: 12min 4.852s

Aug 09 19:28:47 octoprint-host octoprint[822405]: 2022-08-09 19:28:47,831 - octoprint.plugins.tracking - INFO - Sent tracking event ping, payload: {'octoprint_uptime': 125105, 'printer_state': 'OFFLINE'}
Aug 09 19:43:42 octoprint-host octoprint[822405]: 2022-08-09 19:43:42,891 - octoprint.server.heartbeat - INFO - Server heartbeat <3
Aug 09 19:43:47 octoprint-host octoprint[822405]: 2022-08-09 19:43:47,833 - octoprint.plugins.tracking - INFO - Sent tracking event ping, payload: {'octoprint_uptime': 126005, 'printer_state': 'OFFLINE'}
Aug 09 19:58:42 octoprint-host octoprint[822405]: 2022-08-09 19:58:42,892 - octoprint.server.heartbeat - INFO - Server heartbeat <3
Aug 09 19:58:47 octoprint-host octoprint[822405]: 2022-08-09 19:58:47,835 - octoprint.plugins.tracking - INFO - Sent tracking event ping, payload: {'octoprint_uptime': 126905, 'printer_state': 'OFFLINE'}
Aug 09 20:13:42 octoprint-host octoprint[822405]: 2022-08-09 20:13:42,893 - octoprint.server.heartbeat - INFO - Server heartbeat <3
Aug 09 20:13:47 octoprint-host octoprint[822405]: 2022-08-09 20:13:47,844 - octoprint.plugins.tracking - INFO - Sent tracking event ping, payload: {'octoprint_uptime': 127805, 'printer_state': 'OFFLINE'}
Aug 09 20:24:36 octoprint-host systemd[1]: X5SA.service: Main process exited, code=killed, status=11/SEGV
Aug 09 20:24:36 octoprint-host systemd[1]: X5SA.service: Failed with result 'signal'.
Aug 09 20:24:36 octoprint-host systemd[1]: X5SA.service: Consumed 12min 4.852s CPU time.

Does anyone have an idea where I should look to get to the bottom of this? The other instance runs without a problem.

Here's the content of the X5SA.service file:

[Unit]
Description=The snappy web interface for your 3D printer
After=network.online.target
Wants=network.online.target

[Service]
Environment="PORT=5002"
Environment="BASEDIR=/home/thomas//.X5SA"
Environment="CONFIGFILE=/home/thomas//.X5SA/config.yaml"
User=thomas
ExecStart=/home/thomas/OctoPrint/bin/octoprint serve --config=${CONFIGFILE} --basedir=${BASEDIR} --port=${PORT}

[Install]
WantedBy=multi-user.target

I compared it to the X5SAPro.service file and it's basically the same (except for the different values of the variables, of course).
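For anyone who wants to repeat that comparison without eyeballing it, here is a minimal Python sketch. It diffs two unit files while ignoring the `Environment=` lines, which are expected to differ per instance. The function name `non_env_diff` is made up for illustration, and the X5SAPro unit below is a stand-in: its port (5003) and base directory are assumptions, since that file is not shown in this thread.

```python
import difflib


def non_env_diff(unit_a, unit_b):
    """Return the +/- lines that differ between two unit files,
    ignoring blank lines and Environment= lines."""
    def keep(text):
        lines = (ln.strip() for ln in text.splitlines())
        return [ln for ln in lines if ln and not ln.startswith("Environment=")]

    diff = list(difflib.unified_diff(keep(unit_a), keep(unit_b), lineterm="", n=0))
    # The first two entries are the "---"/"+++" file headers; "@@" hunk
    # markers are dropped by the startswith filter.
    return [ln for ln in diff[2:] if ln.startswith(("+", "-"))]


# Unit file as posted above.
X5SA_UNIT = """
[Unit]
Description=The snappy web interface for your 3D printer
After=network.online.target
Wants=network.online.target

[Service]
Environment="PORT=5002"
Environment="BASEDIR=/home/thomas//.X5SA"
Environment="CONFIGFILE=/home/thomas//.X5SA/config.yaml"
User=thomas
ExecStart=/home/thomas/OctoPrint/bin/octoprint serve --config=${CONFIGFILE} --basedir=${BASEDIR} --port=${PORT}

[Install]
WantedBy=multi-user.target
"""

# ASSUMPTION: hypothetical second unit; the real X5SAPro.service is not shown.
X5SAPRO_UNIT = X5SA_UNIT.replace("5002", "5003").replace(".X5SA", ".X5SAPro")

print(non_env_diff(X5SA_UNIT, X5SAPRO_UNIT))
```

An empty list means the two units differ only in their environment values, which matches the description above.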

taker218 avatar Aug 10 '22 07:08 taker218

Okay, I just looked at the dmesg output and found this:

[Tue Aug 9 20:24:35 2022] octoprint[822405]: segfault at 7f6a905d23d0 ip 000000000051d2bc sp 00007fff6263e550 error 6 in python3.9[41f000+288000]
[Tue Aug 9 20:24:35 2022] Code: 4c 8b 6f 40 83 c2 01 48 8d 9f 68 01 00 00 41 89 94 24 b8 00 00 00 4c 39 eb 73 22 48 8b 3b 48 85 ff 74 11 48 c7 03 00 00 00 00 <48> 83 2f 01 0f 84 ba 00 00 00 48 83 c3 08 4c 39 eb 72 de 48 83 7d
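As a reading aid for that segfault line, here is a small Python sketch that pulls it apart. It assumes an x86 kernel, where the `error` value is the page-fault error code printed in hex (bit 0: page was present, bit 1: write access, bit 2: fault came from user mode). The function name `decode_segfault` and the regular expression are my own and only cover the line format shown above.

```python
import re

# Matches lines like:
#   octoprint[822405]: segfault at <addr> ip <ip> sp <sp> error <code> in <obj>[<base>+<size>]
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_segfault(line):
    """Decode a kernel segfault message; return None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    info = {
        "process": m["comm"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),
        "instruction_pointer": int(m["ip"], 16),
        "page_present": bool(err & 1),              # bit 0
        "access": "write" if err & 2 else "read",   # bit 1
        "user_mode": bool(err & 4),                 # bit 2
        "object": m["obj"],
    }
    if m["base"] is not None:
        # Offset of the faulting instruction inside the mapped object.
        info["offset_in_object"] = int(m["ip"], 16) - int(m["base"], 16)
    return info


line = ("[Tue Aug 9 20:24:35 2022] octoprint[822405]: segfault at 7f6a905d23d0 "
        "ip 000000000051d2bc sp 00007fff6263e550 error 6 in python3.9[41f000+288000]")
info = decode_segfault(line)
print(info["access"], info["page_present"], hex(info["offset_in_object"]))
```

Read this way, the line describes a user-mode write to an unmapped page, with the faulting instruction inside the python3.9 binary itself rather than in a loaded extension module.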

taker218 avatar Aug 10 '22 07:08 taker218

Not something I have seen before, even when running many instances. It is possible there is a memory issue, which can give rise to segfaults. You could try running top to see what is happening with memory usage while the two instances run.
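If watching top interactively is impractical because the crashes are hours apart, a rough alternative is to sample `/proc/meminfo` periodically and log it. The sketch below is only that, a sketch: it assumes a Linux host, and the helper names (`parse_meminfo`, `available_fraction`, `sample_meminfo`) are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into {field_name: integer value}.
    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_fraction(info):
    """Fraction of total RAM the kernel estimates is available to new workloads."""
    return info["MemAvailable"] / info["MemTotal"]


def sample_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as fh:
        return parse_meminfo(fh.read())


# Example with canned input so the parsing can be checked on any machine.
sample = "MemTotal:        8000000 kB\nMemAvailable:    2000000 kB\nSwapFree:        1000000 kB\n"
info = parse_meminfo(sample)
print(f"{available_fraction(info):.0%} available, SwapFree {info['SwapFree']} kB")
```

Calling `sample_meminfo()` once a minute from a loop or a timer and appending the result to a file gives a record to compare against the crash timestamps in dmesg. It only shows whether the host was running low on memory at the time of a crash.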

paukstelis avatar Aug 10 '22 11:08 paukstelis

I'll have a look at the memory usage, but there should be enough memory for those two instances.

Maybe a memory stick is bad, since this is an old laptop I'm currently using.

Just a couple of minutes ago the other instance crashed (same error in dmesg output).

taker218 avatar Aug 10 '22 11:08 taker218

Yeah, bad memory might be what you are looking at here.

paukstelis avatar Aug 10 '22 12:08 paukstelis