Misbehaving relay

Open chrisx8 opened this issue 2 years ago • 4 comments

When I try to use relay to connect two machines behind NAT, the behavior is somewhat questionable...

  • If relaying is disabled on all systems, Nebula can establish direct connections between the two devices behind NAT.
  • If relaying is enabled everywhere, Nebula always uses relay, even though it's not necessary (direct connection is possible, see above).
  • If relaying is enabled on one of the two devices, the one with relay enabled becomes unreachable over Nebula.

Here's my setup:

  • One system on AWS t2.micro (Linux x86_64): has public IP, serves as lighthouse and relay
  • One laptop (MacOS arm64): no public IP
  • One desktop (Linux x86_64): no public IP

Config on AWS:

relay:
  am_relay: true
  use_relays: true

Config on laptop:

relay:
  am_relay: true
  use_relays: true  # also tried false, see above

Config on desktop:

relay:
  am_relay: true
  use_relays: true  # also tried false, see above

chrisx8 avatar Jul 03 '22 21:07 chrisx8

Seems like this issue doesn't exist if I disable unsafe_routes across the board.

chrisx8 avatar Jul 03 '22 21:07 chrisx8

You should only set relay.am_relay on the relay host. Laptop and Desktop should have relay.am_relay set to false.

graph LR
    A[Laptop<br>ip:192.168.0.2<br>relay.am_relay:false<br>relay.use_relays:true<br>relay.relays: 192.168.0.1] --> B(Relay Host AWS<br>ip:192.168.0.1<br>relay.am_relay:true<br>relay.use_relays:ignored<br>relay.relays:ignored)
    B --> C[Desktop<br>ip:192.168.0.3<br>relay.am_relay:false<br>relay.use_relays:true<br>relay.relays:192.168.0.1]
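
As a minimal sketch of the layout in the diagram, the relay sections could look like the following. The Nebula IP 192.168.0.1 is taken from the diagram and is only illustrative; substitute the relay host's actual Nebula IP.

```yaml
# Laptop and Desktop (hosts behind NAT): use a relay, do not act as one.
relay:
  relays:
    - 192.168.0.1   # Nebula IP of the relay host (illustrative, from the diagram)
  am_relay: false
  use_relays: true
```

```yaml
# Relay host (AWS lighthouse): the only host that forwards traffic for others.
# Per the diagram, use_relays and relays are ignored on this host, so they are omitted.
relay:
  am_relay: true
```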

If relaying is enabled everywhere, Nebula always uses relay, even though it's not necessary (direct connection is possible, see above).

Relays are an augmentation to existing Nebula connections, so they shouldn't interfere with direct connections. However, it's possible that a relay is established before a direct connection is established. When that happens, Nebula will attempt to create a direct connection after seeing some thousands of packets.

If relaying is enabled on one of the two devices, the one with relay enabled becomes unreachable over Nebula.

Could you share the config from the three hosts that causes this issue to happen? If I understand, you have a Laptop and a Desktop on the same LAN, and a lighthouse in AWS with an internet IP. Let me know the config you're running with, and I'll try to repro the failure.

brad-defined avatar Jul 06 '22 14:07 brad-defined

Your diagram above is correct. The laptop (MacOS) and the desktop (RHEL 9) are on the same LAN, and the lighthouse is on AWS with a public IPv4 and IPv6 address.

Here are my configs: aws.yml.txt, laptop.yml.txt, linux_desktop.yml.txt

chrisx8 avatar Jul 10 '22 01:07 chrisx8

Thanks for the configs, @chrisx8! You reported that if only one host has relays enabled, it becomes unreachable. Are you still seeing that behavior?

If so, please provide the configs that reproduce that error, along with some Nebula logs covering the connection attempt.

I don't see any issues in the configs you provided, and a similar setup is working OK for me, so the logs and configs will help me debug further.

brad-defined avatar Jul 12 '22 13:07 brad-defined

@chrisx8 It's been a while since we've heard from you, so I'm going to close this issue out as stale. If you're continuing to experience issues, please get back to us with the requested info, then ping me on this thread and I'll reopen it. Thanks!

johnmaguire avatar Dec 07 '22 18:12 johnmaguire