ilp-connector
Increase hold_down_time and use unreachable_through_me field
The current routing algorithm relies on refreshing routes every 30 seconds. This allows us to automatically forget routes when they go down, but it is an expensive way to detect server failure.
A more efficient approach would be to set hold_down_time to infinity. Then, when a server detects that its peer is down (for instance, by calling the peer's /api/health endpoint every 10 seconds), it tells its other peers which routes became unreachable.
If a server is shut down gracefully then it can even send this 'unreachable_through_me' update itself.
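To make the proposal concrete, here is a rough sketch of how a routing table without expiry might produce the update. Only hold_down_time and unreachable_through_me come from this issue; the Route and RouteUpdate shapes, the new_routes field, and the RoutingTable class are hypothetical names for illustration, not the connector's actual code.

```typescript
// Hypothetical shapes: only `unreachable_through_me` is named in this issue.
interface Route {
  destination: string; // e.g. a ledger prefix
  nextHop: string;     // the peer through which the destination is reached
}

interface RouteUpdate {
  new_routes: Route[];
  unreachable_through_me: string[];
}

class RoutingTable {
  private routes: Route[] = [];

  // No expiry timer: a route stays until it is explicitly withdrawn,
  // which corresponds to hold_down_time = infinity.
  add(route: Route): void {
    this.routes.push(route);
  }

  // Called when a peer is detected as down. Drops every route through
  // that peer and returns the update to send to the remaining peers.
  peerDown(peer: string): RouteUpdate {
    const lost = this.routes
      .filter((r) => r.nextHop === peer)
      .map((r) => r.destination);
    this.routes = this.routes.filter((r) => r.nextHop !== peer);

    // A destination that is still reachable via another peer
    // is not unreachable through us, so it is not advertised.
    const stillReachable = new Set(this.routes.map((r) => r.destination));
    const unreachable = Array.from(new Set(lost)).filter(
      (d) => !stillReachable.has(d)
    );
    return { new_routes: [], unreachable_through_me: unreachable };
  }

  // Graceful shutdown: everything we advertised becomes unreachable through us.
  shutdown(): RouteUpdate {
    const all = this.destinations();
    this.routes = [];
    return { new_routes: [], unreachable_through_me: all };
  }

  destinations(): string[] {
    return Array.from(new Set(this.routes.map((r) => r.destination)));
  }
}
```

In this sketch a server would call peerDown when it decides a peer has failed, and shutdown just before exiting, broadcasting the returned update in both cases.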
Could you explain what is expensive in the current situation, and how hitting an endpoint every 10 seconds is cheaper?
Ah, I guess if you still make the calls to /api/health then that's pretty much equivalent to the current heartbeat system, just in the other direction :) It gets more efficient, though, if you only mark a peer as down when it is unresponsive while you are sending it a transfer.
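A minimal sketch of that passive variant, assuming the transfer code can report whether each attempt succeeded. The PassiveFailureDetector class and its method names are made up for illustration; nothing here is taken from the connector.

```typescript
// Hypothetical callback type: invoked when a peer is first seen as down.
type PeerDownHandler = (peer: string) => void;

class PassiveFailureDetector {
  private down = new Set<string>();
  private onPeerDown: PeerDownHandler;

  constructor(onPeerDown: PeerDownHandler) {
    this.onPeerDown = onPeerDown;
  }

  // Call after every transfer attempt. `ok` is false when the peer
  // timed out or refused the transfer. No separate polling is needed.
  recordTransfer(peer: string, ok: boolean): void {
    if (ok) {
      // A successful transfer shows the peer is (back) up.
      this.down.delete(peer);
      return;
    }
    if (!this.down.has(peer)) {
      this.down.add(peer);
      // Fire once per outage, e.g. to send unreachable_through_me.
      this.onPeerDown(peer);
    }
  }

  isDown(peer: string): boolean {
    return this.down.has(peer);
  }
}
```

The trade-off is that a failure is only noticed when traffic actually flows through the peer, so idle routes can stay stale until the next transfer attempt.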