Azure Flatcar qdisc issue
In Azure, high-performance Linux networking uses SR-IOV with Mellanox drivers (mlx4 or mlx5). Something specific to Azure is that this creates two interfaces per NIC, a synthetic interface and a virtual function (VF) interface; documentation about it can be found here: https://learn.microsoft.com/en-us/azure/virtual-network/accelerated-networking-how-it-works.
I believe the actual bond is created by the hv_netvsc kernel module, and as we can see in the output below, the enP* interface is picked up by the OS as a stand-alone interface and gets a qdisc attached to it (output from an Azure Flatcar LTS VM image):
ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
3: enP50947s1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 **qdisc mq master eth0 state UP** group default qlen 1000
tc qdisc show
qdisc noqueue 0: dev lo root refcnt 2
qdisc mq 0: dev eth0 root
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc mq 0: dev enP50947s1 root
qdisc fq_codel 0: dev enP50947s1 parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
I believe this to be a faulty configuration in Azure Flatcar VMs using SR-IOV (accelerated networking), since we normally do not apply queuing disciplines to bridged or bonded interfaces such as docker0 or virbr0:
ip a s | egrep '(eth0|docker|br-88dd68ef9e6a)'
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 10000
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
57: br-88dd68ef9e6a: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
61: vethe268329@if60: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br-88dd68ef9e6a state UP group default
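For a quick one-off check, the same command the udev rule below relies on can be run by hand against the VF (the name enP50947s1 is taken from the output above and will differ on other VMs); this does not survive a reboot or the VF being hot-removed and re-added, which is why a udev rule is the more durable fix:
sudo tc qdisc replace dev enP50947s1 root noqueue
tc qdisc show dev enP50947s1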
This also has implications for how systemd applies the default net.core.default_qdisc setting, which is shipped in /lib/sysctl.d/50-default.conf and ends up being misapplied here.
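If the goal is only to change that system-wide default rather than the per-interface setup, a drop-in under /etc/sysctl.d overrides the value from /lib/sysctl.d/50-default.conf; a minimal sketch, with fq chosen here purely as an example value:
/etc/sysctl.d/99-default-qdisc.conf
net.core.default_qdisc = fq
After sudo sysctl --system (or a reboot), the new default applies to qdiscs attached from that point on; it does not rewrite qdiscs already present on running interfaces.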
One of the (simple) fixes I found was to apply the "tuned" udev configuration below for the interface queuing disciplines:
/etc/udev/rules.d/99-azure-qdisc.rules
ACTION=="add|change", SUBSYSTEM=="net", KERNEL=="enP*", PROGRAM="/sbin/tc qdisc replace dev $env{INTERFACE} root noqueue"
ACTION=="add|change", SUBSYSTEM=="net", KERNEL=="eth*", PROGRAM="/sbin/tc qdisc replace dev $env{INTERFACE} root fq maxrate 12.5gbit limit 100000"
Specifically for my tests, I set the maxrate of the fq queuing discipline to match the VM SKU's interface line speed of 12.5 Gbit/s, and I raised the packet limit to 100K. The limits set upstream are fairly arbitrary and may not be the best choice for cloud VMs where higher networking performance is expected.
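For completeness, on a running VM the rules should be picked up without a reboot by reloading udev and re-triggering the net subsystem (standard udevadm usage, nothing Azure-specific), then verifying the result:
sudo udevadm control --reload-rules
sudo udevadm trigger --subsystem-match=net --action=change
tc qdisc show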