Wrong source IP for udp packet generation
Describe the bug
GeyserMC does not work properly on servers with multiple IP addresses.
I have two NICs on my PC:
NIC1: IP: 192.168.1.100 gw 192.168.1.1
NIC2: IP: 192.168.2.100 gw 192.168.2.1
Routes:
default via 192.168.1.1 table main
default via 192.168.2.1 table 2222
Policy routing:
0: from all lookup local
32765: from 192.168.2.100 lookup 2222
32766: from all lookup main
The bug: GeyserMC replies with the wrong source address.
tcpdump logs from the server:
root@mc-server /# tcpdump -nnni any "host 61.231.185.43 and port 19132"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
05:26:40.055019 IP 61.231.185.43.49985 > 192.168.2.100.19132: UDP, length 33
05:26:40.057919 IP 192.168.1.100.19132 > 61.231.185.43.49985: UDP, length 141
05:26:40.058025 IP 61.231.185.43.49985 > 192.168.2.100.19132: UDP, length 33
05:26:40.058831 IP 192.168.1.100.19132 > 61.231.185.43.49985: UDP, length 141
05:26:42.692853 IP 61.231.185.43.49985 > 192.168.2.100.19132: UDP, length 33
05:26:42.693533 IP 192.168.1.100.19132 > 61.231.185.43.49985: UDP, length 141
The client sent packets to 192.168.2.100.19132, but the server responded with source IP 192.168.1.100.19132.
To Reproduce
Method 1
Same as above. Prepare two NICs on the server and set up two routing tables with policy routing.
Method 2
Set up multiple IPs on the same NIC:
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:0c:29:0b:fa:04 brd ff:ff:ff:ff:ff:ff
inet 172.22.3.100/20 scope global enp0s3
valid_lft forever preferred_lft forever
inet 172.22.3.101/20 scope global secondary enp0s3
valid_lft forever preferred_lft forever
And the tcpdump logs:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
05:29:21.824633 IP 172.22.2.40.64779 > 172.22.3.101.19132: UDP, length 548
05:29:21.825535 IP 172.22.3.100.19132 > 172.22.2.40.64779: UDP, length 32
This bug happens because Linux selects the source address with the following rules:
1. User specified (with bind or write)
2. Tracked by the kernel (does not happen for a listening UDP socket)
3. Hit routes by destination IP
   3-1. If pref-src is set on the route, use it
   3-2. If the route points to a NIC, select an IP from it
   3-3. If there are multiple IPs on the NIC, use the IP marked as major
In my first case, rule 3-2 was hit. Both routing tables cover the client IP 61.231.185.43, and because the outgoing packet has no source address yet, the policy rule "from 192.168.2.100 lookup 2222" cannot match, so the lookup falls through to the main table. Linux therefore selects an IP from NIC1 as the source address, although the request arrived on NIC2 and the reply should have used the other table.
In my second case, rule 3-3 was hit: multiple IPs on the same NIC. 172.22.3.101 is marked as secondary, so the kernel just selects 172.22.3.100.
TCP sockets are covered by the system under rule 2, but UDP sockets are not. When the selection falls to rule 3, it is basically a guess, because the information about which IP the client connected to has been lost. This does not matter for a client application, since the server replies to whatever source address it sees, but it does matter for a server application.
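The route-based guess of rule 3 can be observed from user space. The following is a minimal sketch, not Geyser code: the helper name kernel_source_for is hypothetical, and it relies on the fact that connect() on a UDP socket sends nothing and only makes the kernel run its route lookup and fix a local address.

```python
import socket


def kernel_source_for(dest_ip, dest_port=19132):
    """Return the source IP the kernel would choose for a UDP packet to dest_ip.

    Hypothetical helper for illustration. The socket is never bound, so
    rules 1 and 2 do not apply and the kernel falls back to the route lookup.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        # No packet is sent here; connect() only resolves the route
        # and records the chosen local address on the socket.
        s.connect((dest_ip, dest_port))
        return s.getsockname()[0]
```

On the first setup described above, kernel_source_for("61.231.185.43") would presumably return 192.168.1.100, which matches the source address seen in the tcpdump output.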
Expected behaviour
Similar to WireGuard and other UDP-based servers, track UDP sessions manually and reply with the correct source address.
Unlike TCP sessions, the system does not track UDP sessions for the user, so we have to handle it ourselves.
We have to bind a specific source IP for each UDP session; otherwise the system just selects the major IP of the NIC chosen by the default routing table. This breaks on systems that have multiple routing tables or secondary IPs on a NIC.
In most cases this is not an issue, but I have multiple routing tables on my system, which causes the problem. It also causes problems for servers with multiple IP addresses. This is very common on a VPS or a server where additional IPs can be purchased: the IPs marked as secondary on the NIC are not usable with Geyser.
Solution 1:
We should check the destination IP of each incoming packet and bind it as the source IP of the response socket (each client needs one socket).
wireguard-go uses this solution. It saves the destination IP of incoming packets for each peer with a custom-designed sticky socket.
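One way to read the destination IP of an incoming datagram and reuse it as the reply source on Linux is the IP_PKTINFO socket option. The sketch below is an assumption about how Solution 1 could look at the socket level; it is not the wireguard-go or Geyser implementation, and every function name in it is hypothetical. The fallback value 8 for IP_PKTINFO is the Linux constant and is only used when the Python build does not export it.

```python
import socket
import struct

# Some Python builds do not export IP_PKTINFO; 8 is its value on Linux.
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)
# struct in_pktinfo: int ipi_ifindex, in_addr ipi_spec_dst, in_addr ipi_addr
PKTINFO_FMT = "=i4s4s"
PKTINFO_LEN = struct.calcsize(PKTINFO_FMT)


def open_server(port=0):
    """Wildcard UDP socket that asks the kernel to report packet info."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)
    sock.bind(("0.0.0.0", port))
    return sock


def recv_with_dst(sock, bufsize=2048):
    """Receive one datagram and return (data, peer, destination IP)."""
    data, ancdata, _flags, peer = sock.recvmsg(bufsize, socket.CMSG_SPACE(PKTINFO_LEN))
    dst = None
    for level, ctype, cdata in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
            _ifindex, _spec_dst, addr = struct.unpack(PKTINFO_FMT, cdata[:PKTINFO_LEN])
            dst = socket.inet_ntoa(addr)  # address the client actually targeted
    return data, peer, dst


def send_from(sock, data, peer, src_ip):
    """Send a reply whose source address is forced to src_ip."""
    # ifindex 0 lets routing pick the interface; ipi_spec_dst sets the source.
    pktinfo = struct.pack(PKTINFO_FMT, 0, socket.inet_aton(src_ip), bytes(4))
    sock.sendmsg([data], [(socket.IPPROTO_IP, IP_PKTINFO, pktinfo)], 0, peer)


def loopback_demo(target_ip="127.0.0.1"):
    """Round trip on loopback; returns (destination seen, reply source)."""
    server = open_server()
    server.settimeout(2)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2)
    try:
        client.sendto(b"ping", (target_ip, server.getsockname()[1]))
        _data, peer, dst = recv_with_dst(server)
        send_from(server, b"pong", peer, dst)
        _reply, reply_src = client.recvfrom(2048)
        return dst, reply_src[0]
    finally:
        client.close()
        server.close()
```

With this approach a single wildcard socket is enough, and the per-session state reduces to remembering one address. On Linux, where all of 127.0.0.0/8 is local, loopback_demo("127.0.0.2") is expected to show the reply leaving from 127.0.0.2 rather than from the kernel's default choice.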
Solution 2:
Allow multiple bind addresses in the config and spawn one UDP socket per address instead of binding 0.0.0.0 with a single socket.
Each socket then only receives packets addressed to its own IP, and we can send the response through the same socket to make sure the source IP is correct.
bind9 uses this solution. It doesn't listen on 0.0.0.0, but enumerates all IPs on all NICs and creates a socket for each of them.
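Solution 2 can be sketched with plain sockets. The code below is an illustration under assumptions, not bind9's or Geyser's code: the address list stands in for a hypothetical multi-address config option, and open_per_address and serve_once are made-up names. Because each socket is bound to one concrete address, replying through the socket that received the datagram fixes the source IP under rule 1.

```python
import selectors
import socket


def open_per_address(addresses, port):
    """Bind one UDP socket per configured address instead of 0.0.0.0."""
    socks = []
    for ip in addresses:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((ip, port))
        socks.append(s)
    return socks


def serve_once(socks, handler, timeout=2.0):
    """Wait for readable sockets and answer each one through itself.

    handler is a hypothetical callback mapping request bytes to reply bytes.
    Returns the number of datagrams answered.
    """
    handled = 0
    with selectors.DefaultSelector() as sel:
        for s in socks:
            sel.register(s, selectors.EVENT_READ)
        for key, _events in sel.select(timeout):
            sock = key.fileobj
            data, peer = sock.recvfrom(2048)
            # Same socket for the reply, so the source IP is the bound one.
            sock.sendto(handler(data), peer)
            handled += 1
    return handled
```

For the second reproduction case this would be called as open_per_address(["172.22.3.100", "172.22.3.101"], 19132). The trade-off compared with Solution 1 is that addresses added to the NIC later are not picked up until the sockets are reopened.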
Screenshots / Videos
No response
Server Version and Plugins
No response
Geyser Dump
https://dump.geysermc.org/kCwMeUAIXZxu4OtZHMazGI2WF60M8hiI
Geyser Version
5.2.3-SNAPSHOT (b109-49bd564)
Minecraft: Bedrock Edition Device/Version
No response
Additional Context
No response
Trying to fix this issue, but I'm stuck now. https://github.com/GeyserMC/Geyser/commit/51c42d6a935582d45b0f5040ab788a4889c50d01
We can get the recipient IP with packet.recipient() and save it in the Bedrock session.
The next step is to call bind(recipientip) before connect() at the UDP socket level, or to use write() instead.
Not sure how to achieve this yet.
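At the plain socket level, the bind-then-connect step could look like the sketch below. This is a generic illustration assuming Linux semantics, not something the Geyser or Netty API is known to expose: open_listener and open_session_socket are hypothetical names, and sharing the port relies on both sockets setting SO_REUSEADDR before bind().

```python
import socket


def open_listener(port=0):
    """Wildcard listener; SO_REUSEADDR lets session sockets share its port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("0.0.0.0", port))
    return s


def open_session_socket(recipient_ip, port, peer):
    """Per-client socket bound to the IP the client originally targeted.

    recipient_ip is the value that packet.recipient() would provide;
    peer is the client's (ip, port) tuple.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((recipient_ip, port))  # rule 1: source address is now explicit
    s.connect(peer)               # restricts this socket to one client
    return s
```

Replies written with session.send(data) then carry recipient_ip as their source address. This costs one file descriptor per client, which is the main drawback compared with a single socket.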
You're not going to be able to fix that within the Geyser code since it's lower down in the network stack. I've identified the issue and am currently writing a fix.
Hello, if you are on Linux (assuming you are defaulting to epoll; you can check with debug mode), could you try this build: https://github.com/Kas-tle/Geyser/actions/runs/11324559312
Or you can apply this patch to your local repo:
curl https://github.com/Kas-tle/Geyser/commit/c3ac60c0e4627050d5934f1162c1347609cad11a.patch | git apply -v
Issue should be resolved with https://github.com/CloudburstMC/Network/commit/78d16f7b99ed728753a954deede1398cfb5f63c7, which is used with the latest Geyser version. Thanks for reporting the issue!