python-nostr icon indicating copy to clipboard operation
python-nostr copied to clipboard

refactor relay threads

Open jeffthibault opened this issue 2 years ago • 6 comments

Relay class

  • Add thread instantiation to connect method
  • Add is_connected method which checks if the ws.run_forever thread is alive. I think this is more reliable because this thread will be torn down on disconnect.
  • Various renaming

Relay Manager

  • Remove thread instantiation from add_relay (handled in relay.connect now)
  • Add relay_connection_monitor thread, which periodically (connection_monitor_interval_secs) checks if the relays are disconnected and reconnects them if they are.

Note: I felt like reconnection logic belongs in the relay manager as it's job is to manage the relays

jeffthibault avatar Feb 17 '23 13:02 jeffthibault

@callebtc can you give this a review if you get a chance?

jeffthibault avatar Feb 17 '23 13:02 jeffthibault

Also @kdmukai, can you give this a review if you have a chance?

jeffthibault avatar Feb 19 '23 21:02 jeffthibault

Having a separate monitor thread seems much better than triggering reconnect in _on_error(). That approach was hitting a recursion limit for me.

On suggestion: Instead of a fixed connection_monitor_interval_secs, it would be better to have an exponential backoff strategy. Otherwise this client will hopelessly try to reconnect to a down server every ~5 seconds (default), which is spammy in the logs at best. There are libraries out there to do this...or it should be simple to roll your own. Something like, which caps out at 5 mins indefinitely: 5, 10, 30, 60, 300, 300, 300... The backoff should be per-relay, though

jeremywhelchel avatar Feb 21 '23 14:02 jeremywhelchel

Overall, yes, I think the RelayManager should manage the reconnects.

I haven't run this PR yet, but the biggest need here is just a ton of code comments to explain what's going on, overview of who manages what, what each thread is for, when threads block, what cleanup guarantees we have (e.g. is daemon=True infallible?).

kdmukai avatar Feb 21 '23 14:02 kdmukai

I think error handling is an issue (see my comment) but apart from that LGTM. I agree with @kdmukai that the way Threads a distributed across files could be confusing but I think this way is better than before where both threads were launched from the relay manager.

It makes more sense that a Relay has its own Queue thread instead of running side by side with it.

Could not test yet!

callebtc avatar Feb 21 '23 15:02 callebtc

On suggestion: Instead of a fixed connection_monitor_interval_secs, it would be better to have an exponential backoff strategy. Otherwise this client will hopelessly try to reconnect to a down server every ~5 seconds (default), which is spammy in the logs at best. There are libraries out there to do this...or it should be simple to roll your own. Something like, which caps out at 5 mins indefinitely: 5, 10, 30, 60, 300, 300, 300... The backoff should be per-relay, though

@jeremywhelchel I like this idea but if the backoff strategy is per relay, how would the reconnection monitor thread on the relay manager be able to handle that? It seems like we would need a reconnection monitor thread for each relay.

jeffthibault avatar Feb 26 '23 00:02 jeffthibault