Ubuntu Server 20.04 VM networking stops working intermittently

Open tallytarik opened this issue 3 years ago • 26 comments

Describe the issue

I'm running an Ubuntu Server 20.04 VM, set up according to the setup guide.

Every so often, the VM networking stops working while it is running. The VM can no longer access the internet, and I can no longer SSH from the host to the VM. The VM console window still works, and I'm able to log in and use the VM that way. I can shut down and restart the VM, and networking works again.

I've noticed that when networking stops working, the CPU usage for QEMULauncher sits at 100%. Nothing inside the VM (checked with htop) is using this much CPU.

It happens randomly - I can't reproduce it on demand. I've been running this VM daily for a couple of weeks, and I've seen this issue ~5 times. Once it happened twice (after a restart) within about 5 minutes.

Configuration

  • UTM Version: 2.4.1
  • OS Version: macOS Monterey 12.0.1
  • Intel or Apple Silicon? Apple (M1 Max)
  • Shared networking

Crash log

N/A

Debug log

Will add ASAP. Sorry, I enabled debug logging earlier, but have since restarted the VM. I'll wait for the issue to happen again and attach the debug log.

Upload VM config.plist.txt

tallytarik avatar Dec 06 '21 20:12 tallytarik

The CPU usage might be unrelated. I've just had the CPU issue again - where QEMULauncher is at a minimum of 100% - but the VM networking is still working fine.

tallytarik avatar Dec 07 '21 02:12 tallytarik

Same problem. Port forwarding stops working, and I cannot SSH to the VM from the host.

Configuration

  • UTM Version: 2.4.1
  • OS Version: macOS Monterey 12.0.1
  • Intel or Apple Silicon? Apple (M1 Pro)
  • Shared networking

prabhah avatar Dec 07 '21 06:12 prabhah

I've had it happen again just now.

Turns out the debug log is not particularly exciting

Running:  -L /Applications/UTM.app/Contents/Resources/qemu -S -qmp tcp:127.0.0.1:4000,server,nowait -nodefaults -vga none -spice "unix=on,addr=/Users/tallytarik/Library/Group Containers/WDNLXAD4W8.com.utmapp.UTM/257404A5-9A02-474C-AD00-CF75ADFF1F1E.spice,disable-ticketing=on,image-compression=off,playback-compression=off,streaming-video=off,gl=on" -device virtio-ramfb-gl -cpu cortex-a72 -smp cpus=8,sockets=1,cores=8,threads=1 -machine virt,highmem=off -accel hvf -accel tcg,tb-size=1500 -drive if=pflash,format=raw,unit=0,file=/Applications/UTM.app/Contents/Resources/qemu/edk2-aarch64-code.fd,readonly=on -drive if=pflash,format=raw,unit=1,file=/Users/tallytarik/Library/Containers/com.utmapp.UTM/Data/Documents/DockerUbuntu.utm/Images/efi_vars.fd -boot menu=on -m 6000 -device intel-hda -device hda-duplex -name DockerUbuntu -device qemu-xhci,id=usb-bus -device usb-tablet,bus=usb-bus.0 -device usb-mouse,bus=usb-bus.0 -device usb-kbd,bus=usb-bus.0 -device ich9-usb-ehci1,id=usb-controller-0 -device ich9-usb-uhci1,masterbus=usb-controller-0.0,firstport=0,multifunction=on -device ich9-usb-uhci2,masterbus=usb-controller-0.0,firstport=2,multifunction=on -device ich9-usb-uhci3,masterbus=usb-controller-0.0,firstport=4,multifunction=on -chardev spicevmc,name=usbredir,id=usbredirchardev0 -device usb-redir,chardev=usbredirchardev0,id=usbredirdev0,bus=usb-controller-0.0 -chardev spicevmc,name=usbredir,id=usbredirchardev1 -device usb-redir,chardev=usbredirchardev1,id=usbredirdev1,bus=usb-controller-0.0 -chardev spicevmc,name=usbredir,id=usbredirchardev2 -device usb-redir,chardev=usbredirchardev2,id=usbredirdev2,bus=usb-controller-0.0 -device virtio-blk-pci,drive=drive0,bootindex=0 -drive if=none,media=disk,id=drive0,file=/Users/tallytarik/Library/Containers/com.utmapp.UTM/Data/Documents/DockerUbuntu.utm/Images/disk-0.qcow2,cache=writethrough -device usb-storage,drive=drive1,removable=true,bootindex=1,bus=usb-bus.0 -drive if=none,media=cdrom,id=drive1 -device 
virtio-net-pci,mac=E6:84:EB:2B:78:64,netdev=net0 -netdev vmnet-macos,mode=shared,id=net0 -device virtio-serial -device virtserialport,chardev=vdagent,name=com.redhat.spice.0 -chardev spicevmc,id=vdagent,debug=0,name=vdagent -uuid 257404A5-9A02-474C-AD00-CF75ADFF1F1E -rtc base=localtime
qemu-aarch64-softmmu: -netdev vmnet-macos,mode=shared,id=net0: info: Started vmnet interface with configuration:
qemu-aarch64-softmmu: -netdev vmnet-macos,mode=shared,id=net0: info: MTU:              1500
qemu-aarch64-softmmu: -netdev vmnet-macos,mode=shared,id=net0: info: Max packet size:  1514
qemu-aarch64-softmmu: -netdev vmnet-macos,mode=shared,id=net0: info: MAC:              c2:af:bc:c3:cf:9e
qemu-aarch64-softmmu: -netdev vmnet-macos,mode=shared,id=net0: info: DHCP IPv4 start:  192.168.64.1
qemu-aarch64-softmmu: -netdev vmnet-macos,mode=shared,id=net0: info: DHCP IPv4 end:    192.168.64.254
qemu-aarch64-softmmu: -netdev vmnet-macos,mode=shared,id=net0: info: IPv4 subnet mask: 255.255.255.0
qemu-aarch64-softmmu: -netdev vmnet-macos,mode=shared,id=net0: info: UUID:             216D6223-C430-48C2-B471-9E298EB4A802
qemu-aarch64-softmmu: warning: Spice: playback:0 (0x14486e920): setsockopt failed, Operation not supported on socket
qemu-aarch64-softmmu: warning: Spice: record:0 (0x14486e9d0): setsockopt failed, Operation not supported on socket
gl_version 30 - es profile enabled
WARNING: running without ARB/KHR robustness in place may crash

tallytarik avatar Dec 07 '21 22:12 tallytarik

I was able to restore networking by running this in the console window:

ip link set dev enp0s9 down
ip link set dev enp0s9 up

tallytarik avatar Dec 07 '21 22:12 tallytarik
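
For anyone who wants the down/up workaround above as a reusable command, here is a minimal sketch. The function name `bounce_iface` and the `DRY_RUN` variable are made up for illustration; only the two `ip link` commands come from the comment above. It has to run as root inside the guest, not on the macOS host.

```shell
#!/bin/bash
# Sketch only: bounce_iface and DRY_RUN are hypothetical names, not part of UTM.
# Brings the given interface down and back up, as in the workaround above.
bounce_iface() {
    local iface="$1"
    local run=""

    if [ -z "$iface" ]; then
        echo "usage: bounce_iface <interface>" >&2
        return 2
    fi

    # With DRY_RUN=1 the commands are printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi

    $run ip link set dev "$iface" down &&
    $run ip link set dev "$iface" up
}
```

Running `DRY_RUN=1 bounce_iface enp0s9` prints the two commands without touching the interface, which is a cheap way to check the interface name before running it for real.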

Just upgraded to Monterey 12.1 and ran into the same issue. Headless (console) Linux VM Guest.

By the look of things it's initially getting APIPA / ULA addresses:

2: enp0s6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP g0
    link/ether 9a:66:8c:d6:98:4e brd ff:ff:ff:ff:ff:ff
    inet 169.254.234.93/16 brd 169.254.255.255 scope global noprefixroute enp0s6
       valid_lft forever preferred_lft forever
    inet6 fd14:97e6:d2a3:5250:16b1:dd1:b42e:cd54/64 scope global temporary dyna 
       valid_lft 604780sec preferred_lft 86163sec
    inet6 fd14:97e6:d2a3:5250:9866:8cff:fed6:984e/64 scope global dynamic mngtm 
       valid_lft 2591980sec preferred_lft 604780sec
    inet6 fe80::9866:8cff:fed6:984e/64 scope link 
       valid_lft forever preferred_lft forever

~~DHCP bug maybe?~~ It seems to depend on the firewall. Previously on Big Sur I was running with "Drop all incoming connections" and later went down to just stealth mode; neither seemed to interrupt UTM. Now it appears that having the firewall enabled at all breaks DHCP (even setting UTM.app and QEMULauncher.app to "Allow incoming connections" doesn't help).

Setting the IP manually (to what DHCP would normally provide) seems to work, although that might be coincidental:

ip link set enp0s6 down
systemctl stop dhcpcd
ip link set enp0s6 up
ip addr add 192.168.64.4/24 dev enp0s6 
ip ro add default via 192.168.64.1

These are relatively tame firewall settings that feel like they shouldn't be causing issues. Perhaps something is up with the vmnet-macos QEMU driver? [image: firewall settings]

polynomialspace avatar Dec 14 '21 14:12 polynomialspace
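
A 169.254.x.x address like the one in the output above is an IPv4 link-local (APIPA) address, which usually means the guest never got a DHCP lease. The following is a rough sketch of how that state could be detected from `ip` output; `link_local_addrs` is a hypothetical helper name, and it only inspects text passed on stdin.

```shell
#!/bin/bash
# Sketch only: link_local_addrs is a hypothetical helper name.
# Reads `ip -4 addr show` output on stdin, prints every IPv4 address in
# 169.254.0.0/16, and returns non-zero if there is none.
link_local_addrs() {
    awk '
        $1 == "inet" && $2 ~ /^169\.254\./ { print $2; found = 1 }
        END { if (!found) exit 1 }
    '
}
```

Inside the guest this could be used as `ip -4 addr show dev enp0s6 | link_local_addrs && echo "no DHCP lease"`, with `enp0s6` replaced by the actual interface name.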

@polynomialspace Thanks for doing some extra digging!

After I read your comment I tried disabling the firewall, but I just saw the VM networking die again - with firewall off. So maybe it's not a factor?

tallytarik avatar Dec 15 '21 02:12 tallytarik

Got the issue with an Ubuntu 20.04.3 guest (x64) on macOS 12.1 (21C52) (Apple Silicon).

thisisthekap avatar Dec 23 '21 12:12 thisisthekap

I've been encountering the same issue, and the fix by @tallytarik works to fix it at least temporarily.

ip link set dev enp0s9 down
ip link set dev enp0s9 up

Pratyush avatar Jan 04 '22 20:01 Pratyush

Here to add a "me too" on Apple M1 host and arm64 Ubuntu guest.

Networking simply stops working, suddenly and without any repeatable causal pattern, as far as I can discern.

I cannot SSH into the guest, nor can I reach it from any source external to the VM. The window manager still works, and I can either restart the interface as outlined above or reboot the VM to work around it until it strikes again.

I do not have the symptom of "uses 100% CPU" during those times. Other than network connections failing, the system seems to be running as expected.

ml-costmo avatar Feb 07 '22 21:02 ml-costmo

For me it occurred while I was on a call on the host OS using Microsoft Teams. I don't know if that's a coincidence. Having a workaround will help, thanks!

sokurenko avatar Feb 21 '22 00:02 sokurenko

I've been seeing this too. I originally thought that it was happening around network changes, because I would find the VM networking dead after the wifi disconnects (due to the issue with 80MHz channel width). It has been better since I addressed that a few days ago, but it's happened twice today without an associated loss of wifi connectivity. I checked for it around other network transitions (connecting and disconnecting a VPN), but it seemed to still be working. I then found it dead again about 30 minutes later.

agaffney avatar Feb 22 '22 22:02 agaffney

My Ubuntu VM network just dropped again in the middle of using it, but the ip link set dev <device> down/up command suggested above worked to bring it back.

agaffney avatar Feb 23 '22 16:02 agaffney

Hi all,

This issue is very easy to reproduce on the latest openSUSE Leap by sharing a big file (over 10 GB) via SSH. It freezes every time. I can provide logs if necessary.

pawelwiejkut avatar Feb 23 '22 16:02 pawelwiejkut

Something similar seems to happen when using Lima, which also uses QEMU.

agaffney avatar Feb 24 '22 15:02 agaffney

ip link set dev enp0s9 down

I keep getting zsh: command not found: ip error. Any help here??

mcfriend99 avatar Mar 09 '22 14:03 mcfriend99

That sounds like you don't have the iproute2 package installed. It should be a pretty standard part of most Linux distributions these days. There should also be an equivalent command in the older ifconfig utility.

agaffney avatar Mar 09 '22 14:03 agaffney
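
As a sketch of the `ifconfig` equivalent mentioned above: the helper below only prints the commands for whichever tool is present in the guest, it does not run them. `bounce_cmds` is a hypothetical name, and the optional second argument that forces a tool is an assumption added for illustration.

```shell
#!/bin/bash
# Sketch only: bounce_cmds is a hypothetical helper name.
# Prints the down/up commands for an interface, using iproute2 (ip) if it is
# installed and falling back to net-tools (ifconfig) otherwise.
bounce_cmds() {
    local iface="$1"
    local tool="${2:-}"   # optional: force "ip" or "ifconfig"

    if [ -z "$tool" ]; then
        if command -v ip >/dev/null 2>&1; then
            tool=ip
        elif command -v ifconfig >/dev/null 2>&1; then
            tool=ifconfig
        else
            echo "neither ip nor ifconfig found" >&2
            return 1
        fi
    fi

    case "$tool" in
        ip)
            printf 'ip link set dev %s down\nip link set dev %s up\n' "$iface" "$iface"
            ;;
        ifconfig)
            printf 'ifconfig %s down\nifconfig %s up\n' "$iface" "$iface"
            ;;
        *)
            echo "unknown tool: $tool" >&2
            return 1
            ;;
    esac
}
```

Either pair of commands needs root and has to be run inside the Linux guest.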

Oh... I was running it on the host machine. The commands ran successfully on the Ubuntu VM, but the network problem persists. Any help here??

mcfriend99 avatar Mar 09 '22 14:03 mcfriend99

Same issue for me as well. I have noticed it with the pre-built Ubuntu 20.04 image from the gallery. All networking services randomly become unavailable.

ip link set dev enp0s9 down
ip link set dev enp0s9 up

This solution is the only one I have found so far.

armen-y avatar Apr 06 '22 17:04 armen-y

Same issue here. Sometimes restarting the VM is enough to reconnect to it, and sometimes it is not and I need to restart the Mac completely. I will try the commands that bring the network link down and up to see if that helps.

QuentinGuyot-epitech avatar Jul 01 '22 05:07 QuentinGuyot-epitech

I'm also getting this, Ubuntu 22.10 guest, macOS 13.3.1 host on Intel, running on the Apple Virtualisation backend (not qemu). Toggling the network interface on/off from the gnome shell top right menu also works to restore connectivity, at which point VPN etc. must be reconnected.

Similar to a previous commenter, I am using Microsoft Teams on the host mac and it might be correlated. I also have a USB-C ethernet interface and subjectively seem to experience the networking failures more when both that and wifi are plugged in and enabled, although I get it when just using wifi as well.

adeadman avatar May 03 '23 15:05 adeadman

I had this same issue and it was driving me nuts. I finally decided to try changing the Emulated Network Card from 'virtio-net-pci' to 'virtio-net-device'. It's been a week now without any loss of network.

Now if I can just get GL display drivers to not randomly lock up, this work mac can be a Linux desktop all the time. Sooo close I can taste it.

thedarb avatar May 08 '23 01:05 thedarb

This works for me and needs to be done after every reboot: dhclient enp0s1

sokurenko avatar May 08 '23 07:05 sokurenko

I started getting this issue within the last two months (which is weird, as this is a very old issue). On both Ubuntu and RHEL the network dies completely, usually after hours or days, but now more frequently: sometimes after minutes or an hour.

I can solve this by setting the network device to virtio-net-device, but you can only do this on one VM at a time, so only one of my VMs avoids this issue.

dabaer avatar Aug 15 '23 15:08 dabaer

I ended up creating this script that periodically checks if the internet is up, and if not, automatically restarts the interface, based on the solution in this issue.

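#!/bin/bash
# Watchdog: restart the network interface whenever the connectivity check fails.

# Interface to restart; change this to the name shown by `ip link`.
IFACE=enp0s1

if [ "$(whoami)" != root ]
then
        echo "This command must be run as root" >&2
        exit 1
fi

while true
do
        now=$(date +'%Y-%m-%dT%H:%M:%S')
        # Only curl's exit status matters, so the (blank) response body is discarded.
        if curl --silent --show-error --max-time 5 --output /dev/null https://cloudflare-ipfs.com/ipfs/bafkreihdwdcefgh4dqkjv67uzcmw7ojee6xedzdetojuzjevtenxquvyku
        then
                echo "$now - Internet is OK"
        else
                echo "$now - Internet is not OK"
                ip link set "$IFACE" down
                ip link set "$IFACE" up
        fi
        # Wait 15 seconds between checks.
        sleep 15
done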

Notes:

  • This URL points to a blank file.
  • You may need to change enp0s1 to the name of the interface you're using. To check the name run ip link

dtinth avatar Aug 20 '23 22:08 dtinth
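
The interface name differs between the posts in this thread (enp0s9, enp0s6, enp0s1), so a watchdog like the one above could look the name up instead of hard-coding it. This is a minimal sketch under the assumption that the first non-loopback interface is the VM's network card; that may not hold on guests with bridges or Docker networks. `first_iface` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: first_iface is a hypothetical helper name.
# Reads `ip -o link show` output on stdin (one interface per line) and prints
# the name of the first interface that is not the loopback device.
first_iface() {
    awk -F': ' '
        NF >= 2 && $2 != "lo" {
            sub(/@.*/, "", $2)   # drop "@ifN" suffixes used by some virtual links
            print $2
            exit
        }
    '
}
```

In the guest, `iface=$(ip -o link show | first_iface)` would then provide the name to pass to the `ip link set` commands.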

This is happening for me too, although the solution to restart networking with sudo ip link set enp0s1 down / sudo ip link set enp0s1 up does not resolve it. Only a full shut-down and restart seems to fix it.

  • Host environment: macOS 14.2.1 (23C71) (Intel)
  • Guest environment: Ubuntu 20.04.2 with UI
  • UTM version: 4.4.5 (94)

I've also tried disabling/enabling networking through the guest Ubuntu UI, but that doesn't work either.

antun avatar Jan 09 '24 19:01 antun

Same here, after running the latest updates on Ubuntu 22.04.3 LTS.
Intermittent loss of network. I updated the UTM app and am now using the script @dtinth provided.

The strange thing is, this is a VM that has been running over 6 months, and just this week - after the updates, it started falling apart.

chrisvanmeer avatar Jan 31 '24 07:01 chrisvanmeer

I've had a similar issue, except that it occurs around once a day. I figured out that it's related to Netplan not applying the default route when updating the DHCP lease. More info on this can be found here. Replacing Netplan with NetworkManager solved this for me.

illixion avatar Feb 20 '24 07:02 illixion
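
To tell the missing-default-route case described above apart from a dead interface, the routing table can be checked when the network drops. The following is a rough sketch; `default_route_info` is a hypothetical helper name, and it only parses text in the usual `default via <gateway> dev <interface> ...` form printed by `ip route`.

```shell
#!/bin/bash
# Sketch only: default_route_info is a hypothetical helper name.
# Reads `ip route show` output on stdin, prints "<gateway> <interface>" for the
# first default route, and returns non-zero if no default route exists.
default_route_info() {
    awk '
        $1 == "default" {
            gw = ""; dev = ""
            for (i = 2; i < NF; i++) {
                if ($i == "via") gw = $(i + 1)
                if ($i == "dev") dev = $(i + 1)
            }
            print gw, dev
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}
```

If `ip route show | default_route_info` fails while the interface still has its 192.168.64.x address, the symptom matches the missing default route rather than a stalled network device.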

@illixion hmm mine was already set to NetworkManager.
The script to keep the connection alive works for me as a workaround.

chrisvanmeer avatar Feb 20 '24 07:02 chrisvanmeer