Container with podman network not receiving UDP traffic
Issue Description
When running a simple Python server container that listens on a UDP socket and is attached to a podman network, UDP traffic sent to the published port never arrives.
Versions 5.2.0-dev-5d10f77da and 4.9.4-rhel were both tried, with the same results.
This is an MRE of an issue we are having in production. Docker is fine, podman+CNI is fine, only podman+netavark exhibits this behavior. Note that restarting our UDP devices or changing their source port is very cumbersome and we wish to avoid it.
Steps to reproduce the issue
- Create Dockerfile
FROM python:latest
WORKDIR /usr/local/bin
COPY server.py .
CMD ["chmod", "+x", "server.py"]
CMD ["server.py"]
- The corresponding server script
#!/usr/bin/env python3
import socket
server_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server_socket.bind(('', 17000))
while True:
    message, address = server_socket.recvfrom(1024)
    print(f"received from {address}: {message}", flush=True)
- Build the image:
podman build . -t podman_udp_test
- Create the network:
podman network create podman_udp
- Start sending UDP traffic to port 17000 with nping:
nping -g 17580 -p 17000 -c 1000000 --udp 127.0.0.1
- Start the container:
podman run -p 17000:17000/udp --net podman_udp podman_udp_test
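A quick way to confirm that the packets actually reach the host while reproducing (a sketch; tcpdump may need to be installed separately, and lo is the right interface only because the reproducer sends to 127.0.0.1):
# on the host: the nping packets should be visible here even though the server prints nothing
tcpdump -ni lo udp port 17000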
Describe the results you received
No output
Describe the results you expected
Output from the server after receiving packets
podman info output
host:
arch: amd64
buildahVersion: 1.33.8
cgroupControllers:
- cpuset
- cpu
- io
- memory
- hugetlb
- pids
- rdma
- misc
cgroupManager: systemd
cgroupVersion: v2
conmon:
package: conmon-2.1.10-1.el9.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.1.10, commit: fb8c4bf50dbc044a338137871b096eea8041a1fa'
cpuUtilization:
idlePercent: 99.38
systemPercent: 0.28
userPercent: 0.35
cpus: 4
databaseBackend: sqlite
distribution:
distribution: rhel
version: "9.4"
eventLogger: journald
freeLocks: 2032
hostname: ccms-pod
idMappings:
gidmap: null
uidmap: null
kernel: 5.14.0-427.18.1.el9_4.x86_64
linkmode: dynamic
logDriver: journald
memFree: 640634880
memTotal: 8058433536
networkBackend: netavark
networkBackendInfo:
backend: netavark
dns:
package: aardvark-dns-1.10.0-3.el9_4.x86_64
path: /usr/libexec/podman/aardvark-dns
version: aardvark-dns 1.10.0
package: netavark-1.10.3-1.el9.x86_64
path: /usr/libexec/podman/netavark
version: netavark 1.10.3
ociRuntime:
name: crun
package: crun-1.14.3-1.el9.x86_64
path: /usr/bin/crun
version: |-
crun version 1.14.3
commit: 1961d211ba98f532ea52d2e80f4c20359f241a98
rundir: /run/user/0/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
os: linux
pasta:
executable: ""
package: ""
version: ""
remoteSocket:
exists: false
path: /run/podman/podman.sock
security:
apparmorEnabled: false
capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
rootless: false
seccompEnabled: true
seccompProfilePath: /usr/share/containers/seccomp.json
selinuxEnabled: true
serviceIsRemote: false
slirp4netns:
executable: /usr/bin/slirp4netns
package: slirp4netns-1.2.3-1.el9.x86_64
version: |-
slirp4netns version 1.2.3
commit: c22fde291bb35b354e6ca44d13be181c76a0a432
libslirp: 4.4.0
SLIRP_CONFIG_VERSION_MAX: 3
libseccomp: 2.5.2
swapFree: 5367644160
swapTotal: 5368705024
uptime: 583h 32m 27.00s (Approximately 24.29 days)
variant: ""
plugins:
authorization: null
log:
- k8s-file
- none
- passthrough
- journald
network:
- bridge
- macvlan
- ipvlan
volume:
- local
registries:
search:
- registry.access.redhat.com
- registry.redhat.io
- docker.io
store:
configFile: /etc/containers/storage.conf
containerStore:
number: 5
paused: 0
running: 5
stopped: 0
graphDriverName: overlay
graphOptions:
overlay.mountopt: nodev,metacopy=on
graphRoot: /var/lib/containers/storage
graphRootAllocated: 47173337088
graphRootUsed: 20552769536
graphStatus:
Backing Filesystem: xfs
Native Overlay Diff: "false"
Supports d_type: "true"
Supports shifting: "false"
Supports volatile: "true"
Using metacopy: "true"
imageCopyTmpDir: /var/tmp
imageStore:
number: 34
runRoot: /run/containers/storage
transientStore: false
volumePath: /var/lib/containers/storage/volumes
version:
APIVersion: 4.9.4-rhel
Built: 1719829634
BuiltTime: Mon Jul 1 18:27:14 2024
GitCommit: ""
GoVersion: go1.21.11 (Red Hat 1.21.11-1.el9_4)
Os: linux
OsArch: linux/amd64
Version: 4.9.4-rhel
Podman in a container
No
Privileged Or Rootless
Privileged
Upstream Latest Release
Yes
Additional environment details
There was no difference between running nping on localhost and running it from a different machine that can reach the podman container.
Additional information
Starting the Python server first and then starting the UDP sender works as expected, but that doesn't help our use case.
Stopping and restarting the UDP sender program while the container is running doesn't help. Only by changing the source port of the UDP sender does traffic start being received, but we cannot easily change the source port of the UDP traffic.
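Concretely, "changing the source port" here just means re-running the reproducer's nping command with a different -g value (a sketch; any otherwise unused source port works):
# new source port 17581 instead of 17580; the server then starts receiving packets
nping -g 17581 -p 17000 -c 1000000 --udp 127.0.0.1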
This is likely because we do not touch any conntrack entries in netavark. We would have to call into the kernel netlink API to drop the stale entries, and last I checked our netlink crate did not have any support for conntrack types, so we would need to implement the types from scratch, which is a lot of work. In any case this is a netavark issue, so I am moving it there.
Note: if you are a RHEL user it is best to report this through the Red Hat support channels so it can be prioritized better.
Is there a workaround possible?
Manually clear the conntrack entries (assuming that is actually what is causing the issue you are having).
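For the reproducer above, that would look roughly like this (a sketch; conntrack comes from the conntrack-tools package, is run on the host, and 17000 is the port from the MRE):
# list any stale UDP conntrack entries for the forwarded port
conntrack -L --proto udp --dport 17000
# delete them so the next packet is evaluated by the netavark port-forwarding rules again
conntrack -D --proto udp --dport 17000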
I am having this same issue after restarting a pod that uses quadlets (systemctl restart app-pod.service).
I was able to work around it by manually clearing the conntrack entries as suggested.
conntrack -L conntrack | grep 514
conntrack -D conntrack --proto udp --orig-src 192.168.20.1 --orig-dst 192.168.20.2 --sport 514 --dport 5141
Any way I can help troubleshoot why this is happening, and help fix it, so this work around isn't required?
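Until netavark can handle this itself, one way to automate the cleanup for the quadlet case above is a systemd drop-in that deletes the stale entries every time the pod service (re)starts. This is only a sketch under the assumptions in that comment (unit name app-pod.service, syslog traffic on UDP port 5141, conntrack installed at /usr/sbin/conntrack):
# /etc/systemd/system/app-pod.service.d/flush-conntrack.conf
[Service]
# the leading '-' means the unit does not fail when no matching entry exists
ExecStartPost=-/usr/sbin/conntrack -D --proto udp --dport 5141
Then run systemctl daemon-reload and systemctl restart app-pod.service to pick up the drop-in.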
I am having this same issue after restarting a pod that uses quadlets (systemctl restart app-pod.service). I was able to work around it by manually clearing the conntrack entries as suggested.
Any way I can help troubleshoot why this is happening, and help fix it, so this work around isn't required?
Manually clearing the conntrack entries was an acceptable workaround for us in production. Otherwise I cannot offer any further information about this. Sorry
Any way I can help troubleshoot why this is happening, and help fix it, so this work around isn't required?
It happens because the kernel keeps the conntrack entry around for a while. Not sure about the exact timeout, but it is not important.
What needs to happen is for netavark to learn how to flush these entries on setup/teardown, and this requires us to talk to the proper kernel APIs like the conntrack tool does. Calling the conntrack command from netavark does not seem acceptable to me.
Does the netavark team have any desire to actually fix this issue? This is plaguing my environment, as I rely heavily on SNMP collection (traps) as well as syslog via containers. With the push to use Podman on RHEL, and CNI being dropped in favor of Netavark, but UDP not working... this seems oddly low priority for y'all.
This has never made it to the top of our priority list, so at least in the near future we have no plans to work on this.
That doesn't mean I don't want to see this fixed, but so far other work items have been considered more important. Anyone can take on the work to fix this if they want; we happily accept contributions.
And in case you are a RHEL customer, file a support request asking for this feature/fix. The more customers ask for it, the more likely it is to be ranked higher on our list.
Hi @booleanvariable! I am interested in contributing to this project through LFX. Can you recommend some good first issues to work on so I can contribute to the code base and understand the project better?
Hey @Luap99 & @mheon, I would like to be part of this project under LFX term 3. Can you please help me start contributing so that I can get a deep knowledge of the codebase? I have good knowledge of C and have been contributing to and connecting with the CNCF @cilium networking project, so I am familiar with IP and networking concepts. As for Rust, I have only explored the basics and have not deep-dived yet, but I am very eager for some guidance to improve my Rust skills. Please help me get started with code contributions. Thanks, and is there a channel like Slack where we can have a conversation when needed?
Hi @booleanvariable , I’m interested in contributing to this project through the LFX mentorship program. I’ve gone through the repository and would love to start contributing. Could you kindly suggest some good first issues or beginner-friendly tasks that I can work on to get started?
Hi,
Is implementing BPF (Berkeley Packet Filter) as an in-kernel filtering mechanism for conntrack events in scope here? My reading of the issue makes me think the goal is just to directly delete entries via netlink.
Why does netavark use netlink-sys instead of netlink-proto, which is an asynchronous implementation of the netlink protocol (ref)? Is there a specific design reason or history there?
For context, I'm asking because I'm very interested in implementing this for the LFX mentorship program. I've sent a more detailed plan to @mheon and @Luap99 by email for their review.
Thanks
Sorry for delay all, please apply through the official portal if you are interested, I think the applications should open later today. https://mentorship.lfx.linuxfoundation.org/project/07efb861-3c5b-4bc2-9986-593656750ffc
As for easy issues, it is hard for me to judge; https://github.com/containers/netavark/issues/1258 might be one.
Is implementing BPF (Berkeley Packet Filter) as in-kernel filtering mechanism for conntrack events in the scope here? My reading of the issue makes me think the goal is just to directly delete entries via netlink.
We don't want a persistent process; we really only aim to flush/delete the entries for the given container port so the UDP traffic gets redirected accordingly.
Why does netavark use netlink-sys instead of netlink-proto which is an asynchronous implementation of the netlink protocol? ref. Is there a specific design reason or history there?
Async is not a good fit for us because what we do is synchronous anyway for the most part, so the async code would just make an async call and wait for it to finish, which in itself only adds the overhead of the event loop the async runtime uses. In fact we used the async parts in the past: https://github.com/containers/netavark/commit/96993f4f94a04b079f083650a8c5767c8f5c5fb3