Multipass unable to stop
Describe the bug
After starting my VM and running the iter_fitgal.py script, the machine begins running the program seemingly fine. However, it occasionally freezes at a random point within the script, and I am then unable to stop the machine using multipass stop. This has happened before, and each time I was unable to recover the machine and had to purge it and re-install.
To Reproduce
- xhost + _______
- multipass start ubuntu2004-1TB
- multipass exec ubuntu2004-1TB -- bash --login
- source activate envname
- python CODE/iter_fitgal.py
- code runs for a little bit
- FROZEN
- (In another terminal) multipass stop ubuntu2004-1TB
- doesn't result in anything
- (In another terminal) multipass exec ubuntu2004-1TB -- bash --login
- (Result) exec failed: ssh connection failed: 'Timeout connecting to '
Expected behavior
The iter_fitgal.py script should continue running. The script runs several other tasks (IRAF, SExtractor) that usually run seamlessly.
Logs
Ubuntu 20.04.4 LTS ubuntu2004-1TB ttyS0
ubuntu2004-1TB login: [ 28.376614] cloud-init[836]: Cloud-init v. 22.2-0ubuntu1~20.04.1 running 'modules:final' at Tue, 21 Jun 2022 14:22:23 +0000. Up 27.25 seconds.
[ 28.382783] cloud-init[836]: Cloud-init v. 22.2-0ubuntu1~20.04.1 finished at Tue, 21 Jun 2022 14:22:25 +0000. Datasource DataSourceNoCloud [seed=/dev/sr0][dsmode=net]. Up 28.35 seconds
[ 1089.619072] INFO: task jbd2/vda1-8:280 blocked for more than 120 seconds.
[ 1089.620004] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1089.620394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1089.620977] INFO: task kworker/u2:0:3283 blocked for more than 120 seconds.
[ 1089.621461] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1089.621849] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1089.622439] INFO: task cp:3315 blocked for more than 120 seconds.
[ 1089.622890] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1089.623302] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1210.552131] INFO: task jbd2/vda1-8:280 blocked for more than 241 seconds.
[ 1210.552627] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1210.553027] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1210.553624] INFO: task kworker/u2:0:3283 blocked for more than 241 seconds.
[ 1210.554121] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1210.554519] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1210.555127] INFO: task cp:3315 blocked for more than 241 seconds.
[ 1210.555564] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1210.555962] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1331.479144] INFO: task jbd2/vda1-8:280 blocked for more than 362 seconds.
[ 1331.479629] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1331.480027] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1331.480594] INFO: task systemd-journal:355 blocked for more than 120 seconds.
[ 1331.481084] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1331.481799] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1331.482455] INFO: task systemd-timesyn:543 blocked for more than 120 seconds.
[ 1331.482970] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1331.483375] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1331.483951] INFO: task rs:main Q:Reg:643 blocked for more than 120 seconds.
[ 1331.484435] Not tainted 5.4.0-117-generic #132-Ubuntu
[ 1331.484823] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
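For anyone triaging a similar freeze, the hung-task messages in the log above can be condensed to one line per blocked task with the longest reported wait. This is only a rough sketch: the function name summarize_hung_tasks and the file name console.log are hypothetical, and it assumes the console output has been saved with one kernel message per line, as the kernel prints it.

```shell
# Sketch: summarize "blocked for more than N seconds" kernel messages.
# Reads console text on stdin and prints "task:pid<TAB>max_seconds", sorted.
# Assumes one kernel message per line; uses only POSIX awk features.
summarize_hung_tasks() {
  awk '
    /INFO: task .* blocked for more than [0-9]+ seconds/ {
      task = $0
      sub(/.*INFO: task /, "", task)              # drop the timestamp prefix
      secs = task
      sub(/.* blocked for more than /, "", secs)  # keep "N seconds..."
      sub(/ seconds.*/, "", secs)                 # keep "N"
      sub(/ blocked for more than .*/, "", task)  # keep "name:pid"
      if (secs + 0 > max[task]) max[task] = secs + 0
    }
    END { for (t in max) printf "%s\t%d\n", t, max[t] }
  ' | sort
}

# Usage (console.log is a hypothetical file holding the saved console output):
#   summarize_hung_tasks < console.log
```

In this log the journaling thread jbd2/vda1-8 and a cp process are among the first tasks reported as blocked.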
Additional info
- OS: macOS 12.0.1
- multipass version: 1.8.1
- info failed: ssh connection failed: 'Timeout connecting to '
Additional context
Hi @rpan04,
It's not clear to me whether you are using an Intel-based Mac or an M1-based Mac. Also, it looks like the kernel is getting stuck in a task. Could you allocate another core to your instance by passing -c 2 to the launch command?
Hi @townsend2010,
Apologies for not including this already.
Computer specs:
- Processor: 3.3 GHz 6-Core Intel Core i5
- Memory: 32 GB 2667 MHz DDR4
- Storage: 2 TB
Multipass VM specs:
- Storage: 1 TB
- Allocated memory: 16 GB
- CPU cores: 1 (default)
Just to be clear, are you suggesting that I purge this old instance and create a new instance with more CPU cores allocated?
OK, so your instance is using Hyperkit, the default driver. Hyperkit is definitely getting old, and I've been seeing some kernel-related issues with it. We now support QEMU on all Macs except those running macOS 10.14, which, of course, you are not.
You have a few options here.
- Switch to using the qemu driver via $ sudo multipass set local.driver=qemu. This will not delete any existing Hyperkit instances, but will "hide" them. You will need to launch a new instance, and I would still suggest using at least 2 cores.
- Alternatively, create a new instance with at least 2 cores and try that. You can keep the old instance if you need it; if you don't, purge it to free up some resources.
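As a rough sketch of the first option, the steps could look something like the script below. The instance name ubuntu2004-qemu, the helper names run and migrate_to_qemu, and the DRY_RUN switch are illustrative assumptions and not part of Multipass; the memory and disk sizes simply mirror the VM specs given earlier in the thread. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: switch Multipass to the QEMU driver and launch a replacement
# instance with 2 cores. All names and sizes here are illustrative.
set -eu

NEW_NAME="${NEW_NAME:-ubuntu2004-qemu}"  # hypothetical instance name
CPUS="${CPUS:-2}"                        # "at least 2 cores"
DRY_RUN="${DRY_RUN:-1}"                  # 1 = print only, 0 = also execute

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

migrate_to_qemu() {
  # Existing Hyperkit instances are hidden by this, not deleted.
  run sudo multipass set local.driver=qemu
  # -c cores, -m memory, -d disk, -n name; 20.04 is the Ubuntu image.
  run multipass launch -c "$CPUS" -m 16G -d 1T -n "$NEW_NAME" 20.04
  run multipass info "$NEW_NAME"
}

migrate_to_qemu
```

Running it with DRY_RUN=0 in the environment would actually execute the commands; the old Hyperkit instance is left untouched either way.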
FYI, there is a newer version if you'd like to try that.
Thank you for the suggestions and help. I will update to the new multipass, switch to the qemu driver and create a new instance since I am unable to access the old one.
Since the nature of the problem is unprovoked freezing, I won't know if the problem is fixed so I won't have anything to report back unless the error occurs again.
@rpan04,
Sure, we can leave this open for a while and close if you don't report back after some time. Thanks!
Closing this due to no more reports so assuming it's fixed after switching to QEMU.