exo icon indicating copy to clipboard operation
exo copied to clipboard

prioritise thunderbolt over WIFI #41

Open itsknk opened this issue 1 year ago • 17 comments

For issue #41, I figured the solution could be quite simple since it already supports the Thunderbolt discovery and needs to be prioritized.

I've updated the grpc_discovery.py file to get this done.

Changes made:

  • Import netifaces and platform.
  • Create a get_network_interfaces method that prioritizes Thunderbolt interfaces over WiFi on macOS.
  • Make the __init__ method to call get_network_interfaces and store the result.
  • Update task_broadcast_presence to broadcast on all available interfaces, prioritizing Thunderbolt on macOS.
  • Make task_listen_for_peers to listen on all available interfaces.

itsknk avatar Jul 20 '24 06:07 itsknk

This is awesome! First proper PR, congrats!

Code looks great, only one very minor thing for consistency across the codebase, instead of platform use psutil. It's the same thing but just more consistent across the codebase.

AlexCheema avatar Jul 20 '24 07:07 AlexCheema

Will change.

itsknk avatar Jul 20 '24 07:07 itsknk

Also don't we need to add netifaces as a dependency in setup.py?

AlexCheema avatar Jul 20 '24 07:07 AlexCheema

My bad! We do! I just installed it locally and forgot about setup.py! Will update the whole thing and give you a better one!

itsknk avatar Jul 20 '24 07:07 itsknk

Just added an integration test for discovery. Already coming in useful, we'll see if it still passes after your changes.

AlexCheema avatar Jul 20 '24 07:07 AlexCheema

I guess setting up netifaces dependency failed?!

itsknk avatar Jul 20 '24 07:07 itsknk

I just quickly tested this myself locally. For me, the interface name does not begin with thunderbolt.

# before thunderbolt connected
>>> netifaces.interfaces()
['lo0', 'gif0', 'stf0', 'anpi0', 'anpi1', 'anpi2', 'en4', 'en5', 'en6', 'en1', 'en2', 'en3', 'bridge0', 'ap1', 'en0', 'awdl0', 'llw0', 'utun0', 'utun1', 'utun2', 'utun3', 'utun4', 'utun5', 'utun6', 'utun7', 'utun8', 'utun9']
# after thunderbolt connected
>>> netifaces.interfaces()
['lo0', 'gif0', 'stf0', 'anpi0', 'anpi1', 'anpi2', 'en4', 'en5', 'en6', 'en1', 'en2', 'en3', 'bridge0', 'ap1', 'en0', 'awdl0', 'llw0', 'utun0', 'utun1', 'utun2', 'utun3', 'utun4', 'utun5', 'utun6', 'utun7', 'utun8', 'utun9', 'en11', 'en16']

As you can see, en11 and en16 were added after connecting thunderbolt.

AlexCheema avatar Jul 20 '24 23:07 AlexCheema

Eventually we should support automatically upgrading to the best interface available (in this case thunderbolt) - see #51. We can implement this later though, for now this is great.

AlexCheema avatar Jul 20 '24 23:07 AlexCheema

Eventually we should support automatically upgrading to the best interface available (in this case thunderbolt) - see #51. We can implement this later though, for now this is great.

This makes better sense, @AlexCheema.

itsknk avatar Jul 20 '24 23:07 itsknk

Just to clarify, this still needs to be fixed / tested: https://github.com/exo-explore/exo/pull/47#issuecomment-2241334740

AlexCheema avatar Jul 21 '24 00:07 AlexCheema

Update:

  • Look for Thunderbolt in both the hardware and type fields, as Thunderbolt interfaces might be identified in either.
  • Identify WiFi interfaces by checking if the type is 'wi-fi' (case-insensitive).
  • Categories all other interfaces as 'other'.

@AlexCheema, I wrote a test script and it returned:

Screenshot 2024-07-20 at 11 42 16 PM

Please review this.

itsknk avatar Jul 21 '24 06:07 itsknk

Have you got a thunderbolt cable to test this with? I don't think this works in its current state. This is what system_profiler SPNetworkDataType -json gives me for my MacBook Pro's Thunderbolt interface:

{
      "_name": "Thunderbolt Bridge",
      "Ethernet": {
        "MediaOptions": [],
        "MediaSubType": "autoselect"
      },
      "hardware": "Ethernet",
      "interface": "bridge0",
      "ip_address": [
        "169.254.13.209"
      ],
      "IPv4": {
        "AdditionalRoutes": [
          {
            "DestinationAddress": "169.254.13.209",
            "SubnetMask": "255.255.255.255"
          }
        ],
        "Addresses": [
          "169.254.13.209"
        ],
        "ConfigMethod": "DHCP",
        "ConfirmedInterfaceName": "bridge0",
        "InterfaceName": "bridge0",
        "SubnetMasks": [
          "255.255.0.0"
        ]
      },
      "IPv6": {
        "ConfigMethod": "Automatic"
      },
      "Proxies": {
        "ExceptionsList": [
          "*.local",
          "169.254/16"
        ],
        "FTPPassive": "yes"
      },
      "spnetwork_service_order": 4,
      "type": "Ethernet"
    }

AlexCheema avatar Jul 21 '24 07:07 AlexCheema

I had another look and generally I don't think this is doing what we want it to do. I'm not even sure the behaviour is different to before the changes. We're broadcasting on all interfaces (was already the case as far as I can tell) and listening on all interfaces (was already the case).

I think what we need to do here:

  • In the broadcast message, add a priority field which indicates the device's preference to receive connections on this particular interface
  • the listener should keep track of the current priority for each peer (can be in the same dict as known_peers)
  • if the listener receives a broadcast with a higher priority than it currently has for a connected peer, it should disconnect on the old interface and reconnect on the new interface

AlexCheema avatar Jul 21 '24 07:07 AlexCheema

Did:

  • A priority field is added to the broadcast message.
  • Modified known_peers to include the priority info.
  • Implemented reconnection logic in on_listen_message when a higher priority interface is detected.
  • get_network_interfaces returns interfaces with their priorities.
  • A reconnect_peer method is added to handle the reconnection.

@AlexCheema, wrote a small test script that simulates receiving a broadcast from a peer. And to see whether it keeps track of priority and change when a better one comes. Here is the output from the test:

Screenshot 2024-07-21 at 4 09 55 PM

itsknk avatar Jul 21 '24 23:07 itsknk

Did:

  • A priority field is added to the broadcast message.
  • Modified known_peers to include the priority info.
  • Implemented reconnection logic in on_listen_message when a higher priority interface is detected.
  • get_network_interfaces returns interfaces with their priorities.
  • A reconnect_peer method is added to handle the reconnection.

@AlexCheema, wrote a small test script that simulates receiving a broadcast from a peer. And to see whether it keeps track of priority and change when a better one comes. Here is the output from the test:

Screenshot 2024-07-21 at 4 09 55 PM

@itsknk can you commit the test too? and add some assertions so it's a proper test would be great

AlexCheema avatar Jul 21 '24 23:07 AlexCheema

Did:

  • A priority field is added to the broadcast message.
  • Modified known_peers to include the priority info.
  • Implemented reconnection logic in on_listen_message when a higher priority interface is detected.
  • get_network_interfaces returns interfaces with their priorities.
  • A reconnect_peer method is added to handle the reconnection.

@AlexCheema, wrote a small test script that simulates receiving a broadcast from a peer. And to see whether it keeps track of priority and change when a better one comes. Here is the output from the test: Screenshot 2024-07-21 at 4 09 55 PM

@itsknk can you commit the test too? and add some assertions so it's a proper test would be great

Done @AlexCheema

itsknk avatar Jul 21 '24 23:07 itsknk

Hey @AlexCheema, yeah it does look a little complex. After some understanding, I thought of some steps. Please do verify these, and let me know if I'm on the right track.

  • Let GRPCDiscovery class to handle discovering peers and maintaining their information, without any connection logic.
  • The StandardNode class will handle the connection logic, including priority-based reconnections.
  • The update_peers method in StandardNode can check for new peers or higher priority connections and manage the connections accordingly.
  • The periodic topology collection remains, ensuring the node regularly updates its peer connections based on the latest discovery information.

itsknk avatar Jul 22 '24 23:07 itsknk

@itsknk

First of all, thanks a lot for these contributions - experimentation helps a lot to inform future decisions even if things don't get merged. I end up throwing away most the code I write.

There have been quite a few changes since this PR (health checks, broadcast on multiple interfaces, priorities).

There's now a priority that is sent in broadcast messages. Connections for a given node_id are upgraded to the highest priority connection that is healthy.

I think this PR might need to be closed now, but I'm happy to send you a retrospective $100 bounty for your work and it helped inform some of the decisions made in the repo.

If you like, we are currently setting priority to 1 for every kind of interface. If we have a good way to assign meaningful priorities (higher is better) e.g. Thunderbolt > WiFi that would be great.

AlexCheema avatar Oct 03 '24 21:10 AlexCheema