Support attaching instances to a "physical" network that's using a bridge as parent
The host system provides a bridge interface that is connected to a physical interface.
When integrating that bridge via the physical NIC type and passing the bridge interface as the parent, I get the following error when starting the VM.
Failed to start device "eth-1": Failed to get PCI device info for "br-lan": open /sys/class/net/br-lan/device/uevent: no such file or directory
This is because the bridge does not have a `device` directory, but the code requires it for looking up PCI information.
https://github.com/lxc/incus/blob/36701b95b6c1d83ce6d71798670988d6cf580cdf/internal/server/device/nic_physical.go#L211
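For illustration, here is a minimal standalone Go sketch of the failing lookup, assuming only that the sysfs path from the error message is read directly (the actual Incus code resolves further PCI details from this file):

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// A physical NIC (e.g. enp5s0) exposes /sys/class/net/<name>/device/uevent,
	// but a bridge like br-lan has no "device" entry at all, so this read
	// fails with "no such file or directory" -- the error shown above.
	data, err := os.ReadFile("/sys/class/net/br-lan/device/uevent")
	if err != nil {
		fmt.Println("PCI lookup fails:", err)
		return
	}
	fmt.Printf("PCI device info:\n%s", data)
}
```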
The documentation claims I can pass a bridge:
The physical network type connects to an existing physical network, which can be a network interface or a bridge, and serves as an uplink network for OVN.
https://linuxcontainers.org/incus/docs/main/reference/network_physical/
This issue is reproducible on incus 6.10.1.
> and serves as an uplink network for OVN
You're not trying to use it as an uplink for an OVN network but directly as a physical NIC on an instance, so that's probably why this is failing. I very commonly use a "physical" network with a bridge as parent for OVN uplinks and that works just fine.
That said, I think it'd make sense to support attaching instances to such a "physical" type network, just using the normal bridge handling logic if we see it's a bridge.
I updated the issue title accordingly.
and serves as an uplink network for OVNYou're not trying to use it as an uplink for an OVN network but directly as a physical NIC on an instance, so that's probably why this is failing. I very commonly use a "physical" network with a bridge as parent for OVN uplinks and that works just fine.
I've been 'battling' with this very issue recently. Placing the OVN network behind a managed Incus bridge essentially double-NATs everything in the OVN network. Perhaps I don't fully understand the reasoning for why macvlan and physical networks cannot be parents of the OVN bridge.
Why must I create a managed bridge with NAT enabled to be the parent of the OVN network?
You don't. Most production deployments use a physical managed network as the uplink for OVN.
That said, OVN cannot use macvlan for that, but it can use a physical network interface or a VLAN. The main restriction is that this interface or VLAN must be unconfigured on the host (no IP configuration on it at all), as once OVN consumes it, it will no longer be usable by the host system.
Anyway, that's unrelated to this issue.
I understand, and I won't add any more to this issue as I do not want to hijack the thread; however, even SR-IOV interfaces cannot be used. Thanks for your answers and great project!
Hey, I am a UT student and I would like to work on this issue. I am working with a partner who will also comment on this.
Assigned it to you!
Hey, it's the partner here.
@stgraber I would like to ask you for a recommendation on how to best approach understanding this issue. And also, how could I reproduce the issue?
It should be quite easy to reproduce the issue, something like this:
```
sudo ip link add dev br-test type bridge
sudo ip link set dev br-test up
incus network create br-test --type=physical parent=br-test
incus launch images:debian/13 c1 --network br-test
```
The logic to be modified should all be in `internal/server/device/nic_physical.go`.
Basically, the code needs to detect that the `parent` property points to a bridge; this can be checked with a call to `util.PathExists(fmt.Sprintf("/sys/class/net/%s/bridge", name))`.
If it is a bridge, then the bridge attach logic from `nic_bridged.go` should be followed.
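As a rough illustration of that check, here is a self-contained sketch, using `os.Stat` in place of Incus's internal `util.PathExists` helper (which behaves the same way for this purpose):

```go
package main

import (
	"fmt"
	"os"
)

// isBridge reports whether a network interface is a Linux bridge: only
// bridges expose a "bridge" directory under their sysfs entry.
func isBridge(name string) bool {
	_, err := os.Stat(fmt.Sprintf("/sys/class/net/%s/bridge", name))
	return err == nil
}

func main() {
	fmt.Println(isBridge("br-test")) // true for a bridge, false otherwise
}
```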
The aim is to attach an instance to a bridge created by the host system. Unfortunately, this doesn't work via the CLI or the web GUI.
If you edit the instance's YAML configuration directly, this is already possible:

```yaml
devices:
  lan:
    name: lan
    nictype: bridged
    parent: br-test
    type: nic
```

Perhaps this information helps to narrow down the relevant code.
@tomy42 unfortunately I have been unable to recreate your solution by configuring the instance YAML; I still run into issues when I try editing the configuration of the instance:

`Config parsing error: Invalid devices: Device validation failed for "lan": Specified network must be of type bridge`

Could you elaborate on the specific steps you took to get it to work?
I'd recommend focusing on what I mentioned in my earlier comment: https://github.com/lxc/incus/issues/1735#issuecomment-2799170436
As this shows an easy way to reproduce it without having to mess with system-wide OS configuration and also mentions exactly what files need to get modified to make this behave.
@stgraber
We ended up getting this in our launch log, is this the expected behavior?
```
Log: lxc c1 20250501010053.348 ERROR network - ../src/lxc/network.c:lxc_network_move_created_netdev_priv:3549 - Invalid argument - Failed to move network device "br-test" with ifindex 5 to network namespace 2653 and rename to physkMrBSW
lxc c1 20250501010053.348 ERROR start - ../src/lxc/start.c:lxc_spawn:1840 - Failed to create the network
lxc c1 20250501010053.353 ERROR lxccontainer - ../src/lxc/lxccontainer.c:wait_on_daemonized_start:878 - Received container state "ABORTING" instead of "RUNNING"
lxc c1 20250501010053.353 ERROR start - ../src/lxc/start.c:__lxc_start:2107 - Failed to spawn container "c1"
lxc c1 20250501010053.353 WARN start - ../src/lxc/start.c:lxc_abort:1036 - No such process - Failed to send SIGKILL via pidfd 17 for process 2653
lxc 20250501010053.414 ERROR af_unix - ../src/lxc/af_unix.c:lxc_abstract_unix_recv_fds_iov:218 - Connection reset by peer - Failed to receive response
lxc 20250501010053.414 ERROR commands - ../src/lxc/commands.c:lxc_cmd_rsp_recv_fds:128 - Failed to receive file descriptors for command "get_init_pid"
```
Yep, that's the current expected failure.
Basically Incus tries to move the entire bridge into the container rather than attach the container to the bridge.
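To make the distinction concrete, here is a hedged sketch of the two operations in `ip` terms, wrapped in Go; the device names and PID are placeholders, Incus uses its own netlink helpers rather than shelling out, and the commands are printed rather than executed:

```go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Current behavior (wrong for a bridge parent): move the parent device
	// itself into the container's network namespace, which fails.
	move := exec.Command("ip", "link", "set", "br-test", "netns", "2653")

	// Desired behavior: leave the bridge on the host and instead enslave the
	// host end of the instance's veth pair to it.
	attach := exec.Command("ip", "link", "set", "veth-host", "master", "br-test")

	for _, cmd := range []*exec.Cmd{move, attach} {
		fmt.Println(cmd.String())
	}
}
```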
Here is our proposed approach for bridge attaching in Start() after reviewing `nic_bridged.go` (a sketch follows this list):

1. Check that the parent property is a bridge using `util.PathExists(fmt.Sprintf("/sys/class/net/%s/bridge", name))`
2. Call `network.AttachInterface()`
3. Add a lambda to `reverter` that will call `network.DetachInterface()`
4. ~~Check the bridge type and set up VLAN settings on the port using logic similar to `setupNativeBridgePortVLANs()` and `setupOVSBridgePortVLANs()`~~
5. ~~Check if hairpin mode needs to be enabled, following the logic found in `nic_bridged.go`~~

Then in postStop() we will do something akin to `nic_bridged.go` and:

1. ~~Check if the parent was a bridge using `util.PathExists(fmt.Sprintf("/sys/class/net/%s/bridge", name))`~~
2. ~~Check if the host device interface still exists and the host device configuration is not null~~
3. Detach the interface from the bridge
4. Remove the host interface

This is assuming a few things, namely:

- We offer both native and OVS support
- We allow for veth pairs

EDIT: There are flaws with this approach given the nature of physical NICs, which don't use veth connections and don't manage VLAN configurations (that's the bridge's responsibility).
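As a minimal sketch of steps 1-3, here is a self-contained stand-in where `ip link set ... master/nomaster` takes the place of `network.AttachInterface()`/`network.DetachInterface()` (whose exact signatures are an assumption from the discussion, not verified against the tree), and the returned cleanup function plays the role of the reverter lambda:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// pathExists mirrors Incus's util.PathExists helper.
func pathExists(p string) bool {
	_, err := os.Stat(p)
	return err == nil
}

// attachToBridge sketches steps 1-3: detect a bridge parent, attach the
// host-side interface to it, and return a cleanup function for the reverter.
func attachToBridge(parent, hostName string) (cleanup func(), err error) {
	// Step 1: the parent must be a bridge.
	if !pathExists(fmt.Sprintf("/sys/class/net/%s/bridge", parent)) {
		return nil, fmt.Errorf("%q is not a bridge", parent)
	}

	// Step 2: attach the host-side interface (stand-in for AttachInterface).
	if out, err := exec.Command("ip", "link", "set", hostName, "master", parent).CombinedOutput(); err != nil {
		return nil, fmt.Errorf("attach failed: %s", out)
	}

	// Step 3: cleanup detaches again (stand-in for DetachInterface).
	return func() {
		_ = exec.Command("ip", "link", "set", hostName, "nomaster").Run()
	}, nil
}

func main() {
	cleanup, err := attachToBridge("br-test", "veth-host")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer cleanup()
	fmt.Println("attached")
}
```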
So far I have a tentative implementation with changes only to Start() and Stop() in `nic_physical.go`; I was unsure whether I should modify validateEnvironment() or validateConfig().
However, after building and trying to run my modified version of Incus, I still run into the same error when trying to create the container instance.
I am currently trying to add debug print statements using logger.Debug(); however, they aren't showing up when I rebuild with `make debug` and set the environment variable `INCUS_DEBUG` to 1 like the docs say. Is there anything more I should be doing so that my debug statements get printed out?
Try running `incus monitor --pretty`; this will show you all the log messages coming out of Incus.
I am trying to debug the `nic_physical.go` file by adding debug statements using the logger (e.g., d.logger.Debug()), but I am not seeing any output, either using incus monitor or using the --debug flag when calling launch. How can I determine whether this is an issue with the logger not being properly initialized, or whether methods like Start() are not being called at all? What steps can I take to verify the execution flow and ensure my debug statements are working?
EDIT: I have found out that the development build that gets compiled doesn't reflect the changes I make to the device file.
Greetings, may I be a contributor here? I'm not familiar with the Go language or Incus, but I am willing to approach this as a computer networker.
To connect an L2 bridge and an Incus bridge, we normally use a profile. In this case, I can't use that approach.
- Incus version: 6.0.0
I tried `incus monitor --pretty` and `incus launch images:debian/13 c1 --network br-test`.
I think this is the output you are waiting for:
```
DEBUG [2025-05-04T14:15:49Z] Handling API request ip=@ method=GET protocol=unix url=/1.0 username=hooni
DEBUG [2025-05-04T14:15:49Z] Handling API request ip=@ method=GET protocol=unix url=/1.0/networks/br-test username=hooni
DEBUG [2025-05-04T14:15:49Z] Handling API request ip=@ method=GET protocol=unix url=/1.0/events username=hooni
DEBUG [2025-05-04T14:15:49Z] Event listener server handler started id=1cf8e9fa-a313-4dc6-bbe3-bdde525ed233 local=/var/lib/incus/unix.socket remote=@
DEBUG [2025-05-04T14:15:49Z] Handling API request ip=@ method=POST protocol=unix url=/1.0/instances username=hooni
DEBUG [2025-05-04T14:15:49Z] Responding to instance create
DEBUG [2025-05-04T14:15:49Z] New operation class=task description="Creating instance" operation=a10c227e-95af-44e3-b6b1-759cb25146cf project=default
DEBUG [2025-05-04T14:15:49Z] Started operation class=task description="Creating instance" operation=a10c227e-95af-44e3-b6b1-759cb25146cf project=default
INFO [2025-05-04T14:15:49Z] ID: a10c227e-95af-44e3-b6b1-759cb25146cf, Class: task, Description: Creating instance CreatedAt="2025-05-04 14:15:49.288508632 +0000 UTC" Err= Location=none MayCancel=false Metadata="map[]" Resources="map[containers:[/1.0/instances/c1] instances:[/1.0/instances/c1]]" Status=Pending StatusCode=Pending UpdatedAt="2025-05-04 14:15:49.288508632 +0000 UTC"
INFO [2025-05-04T14:15:49Z] ID: a10c227e-95af-44e3-b6b1-759cb25146cf, Class: task, Description: Creating instance CreatedAt="2025-05-04 14:15:49.288508632 +0000 UTC" Err= Location=none MayCancel=false Metadata="map[]" Resources="map[containers:[/1.0/instances/c1] instances:[/1.0/instances/c1]]" Status=Running StatusCode=Running UpdatedAt="2025-05-04 14:15:49.288508632 +0000 UTC"
DEBUG [2025-05-04T14:15:49Z] Handling API request ip=@ method=GET protocol=unix url=/1.0/operations/a10c227e-95af-44e3-b6b1-759cb25146cf username=hooni
DEBUG [2025-05-04T14:15:49Z] Connecting to a remote simplestreams server URL="https://images.linuxcontainers.org"
DEBUG [2025-05-04T14:15:49Z] Acquiring lock for image fingerprint=c4f17b293ea6413a120b169de518c2a75c72c311281cc45ddabbcc8500be4c2e
DEBUG [2025-05-04T14:15:49Z] Lock acquired for image fingerprint=c4f17b293ea6413a120b169de518c2a75c72c311281cc45ddabbcc8500be4c2e
DEBUG [2025-05-04T14:15:49Z] Image already exists in the DB fingerprint=c4f17b293ea6413a120b169de518c2a75c72c311281cc45ddabbcc8500be4c2e
DEBUG [2025-05-04T14:15:49Z] Instance operation lock created action=create instance=c1 project=default reusable=false
INFO [2025-05-04T14:15:49Z] Creating instance ephemeral=false instance=c1 instanceType=container project=default
DEBUG [2025-05-04T14:15:49Z] Adding device device=eth0 instance=c1 instanceType=container project=default type=nic
INFO [2025-05-04T14:15:49Z] Action: instance-created, Source: /1.0/instances/c1 location=none storage-pool=default type=container
DEBUG [2025-05-04T14:15:49Z] Adding device device=root instance=c1 instanceType=container project=default type=disk
INFO [2025-05-04T14:15:49Z] Created instance ephemeral=false instance=c1 instanceType=container project=default
DEBUG [2025-05-04T14:15:49Z] CreateInstanceFromImage started driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:49Z] EnsureImage started driver=btrfs fingerprint=c4f17b293ea6413a120b169de518c2a75c72c311281cc45ddabbcc8500be4c2e pool=default
DEBUG [2025-05-04T14:15:49Z] Setting image volume size driver=btrfs fingerprint=c4f17b293ea6413a120b169de518c2a75c72c311281cc45ddabbcc8500be4c2e pool=default size=
DEBUG [2025-05-04T14:15:49Z] Checking image volume size driver=btrfs fingerprint=c4f17b293ea6413a120b169de518c2a75c72c311281cc45ddabbcc8500be4c2e pool=default
DEBUG [2025-05-04T14:15:49Z] EnsureImage finished driver=btrfs fingerprint=c4f17b293ea6413a120b169de518c2a75c72c311281cc45ddabbcc8500be4c2e pool=default
DEBUG [2025-05-04T14:15:49Z] Set new volume size driver=btrfs instance=c1 pool=default project=default size=
DEBUG [2025-05-04T14:15:49Z] Checking volume size driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:49Z] CreateInstanceFromImage finished driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:49Z] UpdateInstanceBackupFile started driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:49Z] Instance operation lock finished action=create err="<nil>" instance=c1 project=default reusable=false
DEBUG [2025-05-04T14:15:49Z] UpdateInstanceBackupFile finished driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:49Z] Start started instance=c1 instanceType=container project=default stateful=false
DEBUG [2025-05-04T14:15:49Z] Instance operation lock created action=start instance=c1 project=default reusable=false
INFO [2025-05-04T14:15:49Z] Starting instance action=start created="2025-05-04 14:15:49.362772915 +0000 UTC" ephemeral=false instance=c1 instanceType=container project=default stateful=false used="1970-01-01 00:00:00 +0000 UTC"
DEBUG [2025-05-04T14:15:49Z] MountInstance started driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:49Z] MountInstance finished driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:49Z] Starting device device=eth0 instance=c1 instanceType=container project=default type=nic
DEBUG [2025-05-04T14:15:49Z] Starting device device=root instance=c1 instanceType=container project=default type=disk
DEBUG [2025-05-04T14:15:49Z] UpdateInstanceBackupFile started driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:49Z] UpdateInstanceBackupFile finished driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:49Z] Skipping unmount as in use driver=btrfs pool=default refCount=1 volName=c1
DEBUG [2025-05-04T14:15:49Z] Handling API request ip=@ method=GET protocol=unix url="/internal/containers/c1/onstart?project=default" username=root
DEBUG [2025-05-04T14:15:49Z] Scheduler: container c1 started: re-balancing
ERROR [2025-05-04T14:15:49Z] Failed starting instance action=start created="2025-05-04 14:15:49.362772915 +0000 UTC" ephemeral=false instance=c1 instanceType=container project=default stateful=false used="1970-01-01 00:00:00 +0000 UTC"
DEBUG [2025-05-04T14:15:49Z] Start finished instance=c1 instanceType=container project=default stateful=false
INFO [2025-05-04T14:15:49Z] ID: a10c227e-95af-44e3-b6b1-759cb25146cf, Class: task, Description: Creating instance CreatedAt="2025-05-04 14:15:49.288508632 +0000 UTC" Err="Failed to run: /usr/libexec/incus/incusd forkstart c1 /var/lib/incus/containers /run/incus/c1/lxc.conf: exit status 1" Location=none MayCancel=false Metadata="map[]" Resources="map[containers:[/1.0/instances/c1] instances:[/1.0/instances/c1]]" Status=Failure StatusCode=Failure UpdatedAt="2025-05-04 14:15:49.288508632 +0000 UTC"
DEBUG [2025-05-04T14:15:49Z] Failure for operation class=task description="Creating instance" err="Failed to run: /usr/libexec/incus/incusd forkstart c1 /var/lib/incus/containers /run/incus/c1/lxc.conf: exit status 1" operation=a10c227e-95af-44e3-b6b1-759cb25146cf project=default
DEBUG [2025-05-04T14:15:49Z] Instance operation lock finished action=start err="Failed to run: /usr/libexec/incus/incusd forkstart c1 /var/lib/incus/containers /run/incus/c1/lxc.conf: exit status 1" instance=c1 project=default reusable=false
DEBUG [2025-05-04T14:15:49Z] Event listener server handler stopped listener=1cf8e9fa-a313-4dc6-bbe3-bdde525ed233 local=/var/lib/incus/unix.socket remote=@
DEBUG [2025-05-04T14:15:49Z] Handling API request ip=@ method=GET protocol=unix url="/internal/containers/c1/onstopns?netns=%2Fproc%2F8485%2Ffd%2F4&project=default&target=stop" username=root
DEBUG [2025-05-04T14:15:49Z] Instance operation lock created action=stop instance=c1 project=default reusable=false
DEBUG [2025-05-04T14:15:49Z] Instance initiated stop action=stop instance=c1 instanceType=container project=default
DEBUG [2025-05-04T14:15:49Z] Stopping device device=eth0 instance=c1 instanceType=container project=default type=nic
DEBUG [2025-05-04T14:15:50Z] Handling API request ip=@ method=GET protocol=unix url="/internal/containers/c1/onstop?project=default&target=stop" username=root
DEBUG [2025-05-04T14:15:50Z] Instance operation lock inherited for stop action=stop instance=c1 instanceType=container project=default
DEBUG [2025-05-04T14:15:50Z] Instance stopped, cleaning up instance=c1 instanceType=container project=default
DEBUG [2025-05-04T14:15:50Z] Stopping device device=root instance=c1 instanceType=container project=default type=disk
DEBUG [2025-05-04T14:15:50Z] UnmountInstance started driver=btrfs instance=c1 pool=default project=default
DEBUG [2025-05-04T14:15:50Z] UnmountInstance finished driver=btrfs instance=c1 pool=default project=default
INFO [2025-05-04T14:15:50Z] Shut down instance action=stop created="2025-05-04 14:15:49.362772915 +0000 UTC" ephemeral=false instance=c1 instanceType=container project=default stateful=false used="2025-05-04 14:15:49.623328974 +0000 UTC"
DEBUG [2025-05-04T14:15:50Z] Instance operation lock finished action=stop err="<nil>" instance=c1 project=default reusable=false
DEBUG [2025-05-04T14:15:50Z] Scheduler: container c1 stopped: re-balancing
INFO [2025-05-04T14:15:50Z] Action: instance-shutdown, Source: /1.0/instances/c1
```
My System Config:

Systemd Network NetDev Config:

```ini
# 12-br-lan.netdev
[NetDev]
Name=br-lan
Kind=bridge
```

and

Systemd Network Network Config:

```ini
# 12-br-lan.network
[Match]
Name=br-lan

[Network]
Description="LAN Network"
Address=192.168.1.10/24
Gateway=192.168.1.1
DNS=192.168.1.1
IPv6AcceptRA=yes
IPForward=no
```
And the Incus profile that is assigned to the instance:

```yaml
name: LAN-Intern
description: Locals LAN
devices:
  lan:
    name: lan
    nictype: bridged
    parent: br-lan
    type: nic
config: {}
project: default
```
Versions 6.0 and 6.12 are working.
To create the bridge: `sudo ip link add dev br-test type bridge`
To bring the interface up: `sudo ip link set dev br-test up`
To show bridges: `brctl show` (this needs `sudo apt install bridge-utils`)
To create the Incus bridge: `incus network create br-test --type=bridge`

- If `--type` is physical, there is an error when editing the profile, so I changed it to bridge.
- Do not create a br-test to connect the L2 bridge directly.

To edit the default profile: `incus profile edit default`

Value:

```yaml
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br-test
    type: nic
```

To create the instance: `incus launch images:debian/13 c1`
The image below is what I have done.
Hey there,
So you ran make and got a new incusd in ~/go/bin/incusd. How are you then running that?
The most common way is either to completely stop the system-wide daemon and start yours manually, or to stop the system wide daemon and replace the system binary with yours, then start it back up.
Right now I invoke `incus admin shutdown` to turn off the daemon, then I call `systemctl restart incus` to turn the daemon back on, then proceed with the rest of the commands. Is this the right approach?
Related to the issue itself: so far I have modified `nic_physical` in validateConfig(), Start(), Stop(), and PostStop() by porting over code from `nic_bridged` and guarding that code with checks that make sure the parent is a bridge. I wasn't completely sure whether I needed to modify the configuration checks in validateConfig(), since that could be considered part of the "bridge attachment logic".
That's fine so long as you also substitute the system binary with the one you just built.
If using the Zabbly package, it's at `/opt/incus/bin/incusd`; if using another distribution, it may be directly under `/usr/bin` or under `/usr/lib/incus` or something like that.
So one thing worth noting here, the only case where we should attempt bridge attachment within nic_physical is if d.network != nil, that is, we are dealing with an Incus managed network.
We do not want someone to start messing with bridges by doing incus config device add NAME eth0 nic nictype=physical parent=br0. (Which would be the case where d.network == nil).
To get back to your validateConfig question: I don't think any change should be needed there, as we're only going to hit nic_physical through the managed code path, and that typically doesn't provide you with much in the way of direct configuration on the NIC.
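Put as a sketch, the guard might take roughly this shape inside nic_physical.go's Start() (a hypothetical fragment, not compilable on its own; `isBridge` is shorthand for the `util.PathExists` sysfs check mentioned earlier):

```go
// Only managed networks (d.network != nil) get the bridge-attach path;
// a raw "nictype=physical parent=br0" device keeps today's behavior.
if d.network != nil && isBridge(d.config["parent"]) {
	// Follow the attach logic from nic_bridged.go.
} else {
	// Existing physical pass-through logic: move the device itself
	// into the instance's namespace.
}
```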
Okay, so don't worry about validateConfig() and add an extra check for d.network != nil.
Presently my Start() does the following things:
- Configures a veth pair (container) or a TAP device (VM)
- Rebuilds the dnsmasq config if it's a managed bridge
- Applies host-side routes and limits
- Disables IPv6 on the veth interface
- Sets up network filters
- Disables router advertisement/acceptance and enables port isolation
- Sets up VLAN settings on the bridge
- Checks for and enables hairpin mode

But because of what you've said about limiting how much direct configuration we provide, maybe some of these steps are unnecessary. For example, the documentation doesn't show any "security" options, so should we skip setting up filters and assume that was configured externally beforehand?
Btw, this did the trick, and I found that my implementation no longer reproduces the issue.