nebula
Misbehaving relay
When I try to use a relay to connect two machines behind NAT, the behavior is somewhat questionable:
- If relaying is disabled on all systems, Nebula can establish direct connections between the two devices behind NAT.
- If relaying is enabled everywhere, Nebula always uses relay, even though it's not necessary (direct connection is possible, see above).
- If relaying is enabled on one of the two devices, the one with relay enabled becomes unreachable over Nebula.
Here's my setup:
- One system on AWS t2.micro (Linux x86_64): has public IP, serves as lighthouse and relay
- One laptop (MacOS arm64): no public IP
- One desktop (Linux x86_64): no public IP
Config on AWS:
relay:
  am_relay: true
  use_relays: true
Config on laptop:
relay:
  am_relay: true
  use_relays: true # also tried false, see above
Config on desktop:
relay:
  am_relay: true
  use_relays: true # also tried false, see above
Seems like this issue doesn't exist if I disable unsafe_routes across the board.
You should only set relay.am_relay on the relay host. Laptop and Desktop should have relay.am_relay set to false.
graph LR
A[Laptop<br>ip:192.168.0.2<br>relay.am_relay:false<br>relay.use_relays:true<br>relay.relays: 192.168.0.1] --> B(Relay Host AWS<br>ip:192.168.0.1<br>relay.am_relay:true<br>relay.use_relays:ignored<br>relay.relays:ignored)
B --> C[Desktop<br>ip:192.168.0.3<br>relay.am_relay:false<br>relay.use_relays:true<br>relay.relays:192.168.0.1]
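As a rough sketch, the relay sections matching the diagram would look something like the following. The IPs are the example Nebula IPs from the diagram, not necessarily the ones in your network, and each fragment belongs in that host's own config file:

```yaml
# Relay host (AWS), Nebula IP 192.168.0.1 in the diagram.
# Per the diagram, use_relays and relays are ignored on this host.
relay:
  am_relay: true
---
# Laptop, Nebula IP 192.168.0.2 in the diagram.
relay:
  am_relay: false
  use_relays: true
  relays:
    - 192.168.0.1   # Nebula IP of the relay host
---
# Desktop, Nebula IP 192.168.0.3 in the diagram.
relay:
  am_relay: false
  use_relays: true
  relays:
    - 192.168.0.1   # Nebula IP of the relay host
```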
> If relaying is enabled everywhere, Nebula always uses relay, even though it's not necessary (direct connection is possible, see above).
Relays are an augmentation to existing Nebula connections, so they shouldn't interfere with direct connections. However, it's possible that a relay is established before a direct connection is established. When that happens, Nebula will attempt to create a direct connection after seeing some thousands of packets.
> If relaying is enabled on one of the two devices, the one with relay enabled becomes unreachable over Nebula.
Could you share the configs from the three hosts that cause this issue to happen? If I understand correctly, you have a Laptop and a Desktop on the same LAN, and a lighthouse in AWS with a public IP. Let me know the config you're running with, and I'll try to repro the failure.
Your diagram above is correct. The laptop (MacOS) and the desktop (RHEL 9) are on the same LAN, and the lighthouse is on AWS with a public IPv4 and IPv6 address.
Here are my configs: aws.yml.txt laptop.yml.txt linux_desktop.yml.txt
Thanks for the configs, @chrisx8! You reported that if only one host has relays enabled, it becomes unreachable. Are you still seeing that behavior?
If so, please provide the configs that reproduce that error, along with some Nebula logs covering the connection attempt.
I don't see any issues in the configs you provided, and a similar setup is working OK for me, so the logs and configs will help me debug further.
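In case it helps with collecting those logs, one option is to raise the log level on each host while reproducing the problem. A minimal sketch, assuming the standard `logging` section of the Nebula config (debug output is noisy, so set it back afterwards):

```yaml
logging:
  level: debug   # assumed to be "info" in your current configs
  format: text
```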
@chrisx8 It's been a while since we've heard from you, so I'm going to close this issue out as stale. If you're continuing to experience issues, please get back to us with the requested info, then ping me on this thread and I'll reopen it. Thanks!