network lost after starting genie

Open Gerdya opened this issue 3 years ago • 9 comments

Windows version (build number): 19044.1466

Linux distribution: Ubuntu 20.04 via wsl --install

Genie version: 1.44

Describe the bug: A while after starting genie -v -s (this happens before the bottle is ready to enter), I lose the network connection.

Confirm that you are running inside the bottle: Network connection is lost inside and outside the bottle.

To Reproduce

  1. Start wsl (networking is working)

~/projects/genie$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 172.17.72.67 netmask 255.255.240.0 broadcast 172.17.79.255
        inet6 fe80::215:5dff:feb7:2c72 prefixlen 64 scopeid 0x20<link>
        ether 00:15:5d:b7:2c:72 txqueuelen 1000 (Ethernet)
        RX packets 21 bytes 4563 (4.5 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 13 bytes 1006 (1.0 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

  2. Start genie with genie -v -s. Once I could enter the bottle (a few minutes later):

~$ genie -b inside

~$ systemctl list-units --failed
  UNIT                                 LOAD   ACTIVE SUB    DESCRIPTION
● systemd-networkd-wait-online.service loaded failed failed Wait for Network to be Configured
● systemd-remount-fs.service           loaded failed failed Remount Root and Kernel File Systems
● user-runtime-dir@1000.service        loaded failed failed User Runtime Directory /run/user/1000
● multipathd.socket                    loaded failed failed multipathd control socket

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

4 loaded units listed.

~$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet6 fe80::215:5dff:feb7:2c72 prefixlen 64 scopeid 0x20<link>
        ether 00:15:5d:b7:2c:72 txqueuelen 1000 (Ethernet)
        RX packets 41 bytes 8883 (8.8 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 55 bytes 12983 (12.9 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

~$ networkctl
IDX LINK   TYPE     OPERATIONAL SETUP
  1 lo     loopback carrier     unmanaged
  2 bond0  bond     off         unmanaged
  3 dummy0 ether    off         unmanaged
  4 tunl0  tunnel   off         unmanaged
  5 sit0   sit      off         unmanaged
  6 eth0   ether    degraded    configuring

Expected behavior: The network keeps working.

Screenshots

Additional context: Not sure if it makes a big difference, but I am working on an aarch64 machine (Surface Pro X). For this I had to modify some files:

~/projects/genie$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   binsrc/genie/Makefile
        modified:   binsrc/genie/genie.csproj
        modified:   binsrc/runinwsl/Makefile
        modified:   binsrc/runinwsl/runinwsl.csproj
        modified:   package/local/Makefile

I confirm that I have read the ENTIRE supplied readme file and checked for relevant information on the repository wiki before raising this issue, and that if the solution to this issue is found in either location, it will be closed without further comment:

  • [X] Yes.

Gerdya avatar Jan 31 '22 00:01 Gerdya

Looks like you've got systemd-networkd trying to reconfigure eth0 and failing, which will cause you network trouble since WSL already configures the network interfaces outside of systemd. Try commenting out the eth0 match in /etc/systemd/network/wired.network, i.e.:

[Match]
# Name=eth0

A couple of those other failing units have solutions in the Wiki. Try the eth0 thing and fixing those first, and then we'll see about the others.

cerebrate avatar Feb 03 '22 01:02 cerebrate

I'd also appreciate a copy of those modifications you made.

cerebrate avatar Feb 03 '22 01:02 cerebrate

Thanks for the feedback!

Modifications: replace "x64" with "arm64" in all of the files below; this changes the RID (runtime identifier):

  • binsrc/genie/Makefile
  • binsrc/genie/genie.csproj
  • binsrc/runinwsl/Makefile
  • binsrc/runinwsl/runinwsl.csproj
  • package/local/Makefile
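The substitution described above could be scripted roughly as follows. This is only a sketch: the function name retarget_arm64 is made up for illustration, and a blanket replace of every "x64" string is an assumption that may touch more than the RID, so the resulting diff should be reviewed.

```shell
# Hypothetical helper (not part of genie): replace "x64" with "arm64"
# in each file given as an argument.
retarget_arm64() {
    for f in "$@"; do
        # -i.bak edits in place and keeps the original as <file>.bak
        sed -i.bak 's/x64/arm64/g' "$f" || return 1
    done
}
```

It would be called from the repository root with the five paths listed above as arguments, followed by git diff to check what actually changed.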

In theory this already would have worked, except there is an architecture mismatch in the NuGet package: Linux.ProcessManager

So I cloned the Linux.ProcessManager source project locally (via git clone) and added it to genie.csproj as a project reference:

<ProjectReference Include="../../../Linux.ProcessManager/ProcessManager/ProcessManager.csproj" />

Genie is now happily building and installing.

Regarding /etc/systemd/network/wired.network:

The folder was originally empty. I thought this might be the problem, so I created a file /etc/systemd/network/20-wired.network with the following content:

[Match]
Name=eth0

[Network]
DHCP=yes

That still did not work, so I took your advice and changed it to:

[Match]
# Name=eth0

[Network]
DHCP=yes

Alas, still no network available :(

Would it be a solution to just disable systemd-networkd.service?

Gerdya avatar Feb 04 '22 20:02 Gerdya

If you're not using bridged networking, you want to disable systemd-networkd management of the interface as shown here:

https://github.com/arkane-systems/genie/wiki/Systemd-units-known-to-be-problematic-under-WSL#systemd-networkdservice

WSL internal networking doesn't support DHCP; it just inherits an already existing network configuration, so letting systemd-networkd try to manage it will always fail.
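As a rough illustration of taking eth0 out of systemd-networkd's hands, the sketch below writes a .network file that marks the link as unmanaged. The file name 10-eth0-unmanaged.network and the function name are assumptions made for this example, and the wiki page linked above remains the authoritative description of the recommended setup.

```shell
# Hypothetical helper: write a .network file telling systemd-networkd
# to leave eth0 alone. The target directory can be overridden for testing.
write_unmanaged_eth0() {
    dir="${1:-/etc/systemd/network}"
    # The 10- prefix sorts before e.g. 20-wired.network, so this file
    # is the first match for eth0.
    cat > "$dir/10-eth0-unmanaged.network" <<'EOF'
[Match]
Name=eth0

[Link]
Unmanaged=yes
EOF
}
```

After running the function as root and restarting systemd-networkd, networkctl should list eth0 with SETUP "unmanaged" rather than "configuring".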

cerebrate avatar May 09 '22 04:05 cerebrate

I am also having issues with networking after starting genie. I followed the steps to ensure that eth0 would be unmanaged and also disabled all systemd network services. However, I am still left with no networking on the eth0 device.

wsl:~$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,DYNAMIC,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether a2:1b:83:ef:b2:15 brd ff:ff:ff:ff:ff:ff
3: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether f2:dd:84:93:c5:3e brd ff:ff:ff:ff:ff:ff
4: eth0: <BROADCAST,MULTICAST,DYNAMIC,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:15:5d:a8:29:fc brd ff:ff:ff:ff:ff:ff
    inet6 fe80::215:5dff:fea8:29fc/64 scope link
       valid_lft forever preferred_lft forever
5: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
6: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0

The IPv4 address disappears, and occasionally returns as a link-local IP address.
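To catch the moment the address goes away, a small check like the one below may help. It is a sketch that assumes the one-line-per-address format printed by ip -o -4 addr show; the function name has_ipv4 is made up for this example.

```shell
# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and
# succeeds if the named interface has at least one IPv4 address.
has_ipv4() {
    # In -o output, field 2 is the interface name and field 3 the family.
    awk -v ifc="$1" '$2 == ifc && $3 == "inet" { found = 1 } END { exit !found }'
}
```

For example, running ip -o -4 addr show | has_ipv4 eth0 || echo "eth0 lost IPv4 at $(date)" in a loop before and after starting genie shows roughly when the address is dropped.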

EDIT:

As an update, it seems that none of my WSL2 machines can get an IP address after I start genie in just one of them.

noorez avatar Jul 06 '22 20:07 noorez

After masking systemd-udevd.service, I was able to get a fully functioning network. I don't know much about how to control this service (I couldn't find a way to make it ignore network configuration altogether via a .link file), but disabling it helped.

noorez avatar Aug 11 '22 20:08 noorez

@noorez Interesting. I wasn't aware of potential problems stemming from systemd-udevd (indeed, up until fairly recently I wasn't all that aware of its role in network interface configuration).

I want to go ahead and document this on the wiki for others who may have your problem. Could you let me have the contents of the files in your /etc/systemd/network directory? I suspect there must be some difference there, inasmuch as I'm running with the default configuration per the systemd-udevd man page and systemd-udevd causes no issues on my system.

cerebrate avatar Aug 12 '22 17:08 cerebrate

@cerebrate So this actually only seemed to work on my Ubuntu 20.04 WSL machine. It didn't work at all on my Debian one; in fact, starting genie on the Debian machine ended up killing the working network configuration on the already working Ubuntu one (as it was doing before). Is systemd somehow killing the Hyper-V WSL virtual switch? I don't understand systemd well enough to see how or why it was doing it, but masking the systemd-udevd service seemed to work for me, at least on Ubuntu (it wasn't enough to simply set the interface to unmanaged).

The directories on both systems were empty.

noorez avatar Aug 12 '22 19:08 noorez

I am unsure as to what the problem might be; on my Debian distro, I can't reproduce it, unfortunately.

For what it's worth, it seems odd that the folder is empty; the /etc/systemd/network folder should by default on Debian contain a 99-default.link file, containing the following:

[Link]
NamePolicy=kernel database onboard slot path
MACAddressPolicy=persistent

Which gives the error

Aug 10 23:17:32 pallas-wsl systemd-udevd[147]: /etc/systemd/network/99-default.link: No valid settings found in the [Match] section, ignoring file. To match all interfaces, add OriginalName=* in the [Match] section.

when I do a systemctl status systemd-udevd (as I believe it's supposed to, since it's just providing defaults for systemd-networkd), and things otherwise proceed fine. Maybe you can get something out of the systemd-udevd log in the same way?

I will say that it's not supported - can't be, really, unfortunately - to use genie/systemd in multiple distros on the same machine at the same time. The problem is that all the distros share a single kernel, and while some things (pids, mounts, etc.) are isolated between distros, other things - including network interfaces - aren't; so if you have multiple systemd instances running in different distros, they're both going to try and manage things without being aware of the other systemd trying to do the same, and that's pretty much guaranteed to result in issues.

cerebrate avatar Aug 12 '22 21:08 cerebrate

Insane major fail on my part for not noticing or realizing :(

I had the lxqt and xfce packages installed. I needed to get NetworkManager/connman to ignore the network interfaces, since the configuration file that tells systemd-networkd to ignore eth0 didn't apply to, or wasn't respected by, NetworkManager/connman. There was no need (it seems) to mask the systemd-udevd service.
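For anyone in the same situation, telling NetworkManager to ignore eth0 might look roughly like the sketch below. The file name 99-wsl-unmanaged.conf and the function name are assumptions made for this example; unmanaged-devices in the [keyfile] section is a standard NetworkManager.conf setting.

```shell
# Hypothetical helper: drop a NetworkManager config snippet that marks
# eth0 as unmanaged. The target directory can be overridden for testing.
write_nm_unmanaged_eth0() {
    dir="${1:-/etc/NetworkManager/conf.d}"
    cat > "$dir/99-wsl-unmanaged.conf" <<'EOF'
[keyfile]
unmanaged-devices=interface-name:eth0
EOF
}
```

NetworkManager needs a restart or reload to pick this up. connman has a comparable NetworkInterfaceBlacklist setting in the [General] section of its main.conf.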

noorez avatar Aug 18 '22 17:08 noorez