exo
More robust networking
- Automatically select the best network interface for a given node, e.g. we should prioritise Thunderbolt over WiFi, and automatically switch over when a better interface becomes available
- Detect more quickly when a peer is unavailable. Right now you have to wait for the discovery module to clean up the peer, which takes ~20 secs. Ideally the peer's status is updated as soon as it becomes unavailable
- On-the-fly re-routing. Right now, if a node disconnects during a request, that request fails. We should automatically update the topology and re-route in real time so requests recover.
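
To make the first and third points concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical: `Interface`, `INTERFACE_PRIORITY`, `pick_interface` and `send_with_reroute` are illustrative names, not existing exo APIs, and the priority values are placeholders.

```python
from dataclasses import dataclass

# Hypothetical priority table (lower = preferred); not exo's actual values.
INTERFACE_PRIORITY = {"thunderbolt": 0, "ethernet": 1, "wifi": 2}


@dataclass(frozen=True)
class Interface:
    name: str  # OS-level name, e.g. "en0"
    kind: str  # "thunderbolt", "ethernet", "wifi", ...
    up: bool   # whether the link is currently usable


def pick_interface(interfaces):
    """Return the highest-priority interface that is up, or None if none is."""
    candidates = [i for i in interfaces if i.up]
    if not candidates:
        return None
    # Unknown interface kinds sort last.
    return min(candidates, key=lambda i: INTERFACE_PRIORITY.get(i.kind, 99))


def send_with_reroute(peers, send, max_attempts=3):
    """Try candidate peers in order.

    A peer that raises ConnectionError is dropped from the local view of the
    topology and the request is retried on the next one. Returns the result
    together with the list of peers still considered alive.
    """
    alive = list(peers)
    attempts = 0
    while alive and attempts < max_attempts:
        peer = alive[0]
        try:
            return send(peer), alive
        except ConnectionError:
            alive.remove(peer)  # update topology immediately, no 20 s wait
            attempts += 1
    raise ConnectionError("no reachable peer")
```

Re-running `pick_interface` whenever an interface comes up or goes down would give the automatic switch-over. The same `ConnectionError` path could also notify the discovery module, so the peer is marked unavailable right away instead of waiting for cleanup.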
Previous PR: #194
- One more case: a node can stay alive while its GPU gets stuck, so we also need to detect when the backend does not respond, i.e. add a call timeout.
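
A minimal sketch of such a call timeout, assuming the backend call is exposed as an async callable. `call_with_timeout` and `BackendTimeout` are made-up names for illustration, not part of exo:

```python
import asyncio


class BackendTimeout(Exception):
    """Raised when the backend does not answer within the deadline."""


async def call_with_timeout(coro_fn, timeout_s):
    # coro_fn is a hypothetical zero-argument async callable wrapping the
    # backend/inference call.
    try:
        return await asyncio.wait_for(coro_fn(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Translate into a dedicated error so callers can treat the node as
        # unhealthy even though its process is still alive.
        raise BackendTimeout(f"no response within {timeout_s}s") from None
```

One caveat: `asyncio.wait_for` can only fire if the wrapped call yields to the event loop. If the stuck GPU call blocks the loop synchronously, it would need to run in an executor or a separate process for the timeout to take effect.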