rust-lightning icon indicating copy to clipboard operation
rust-lightning copied to clipboard

Add max_no_channel_peers to UserConfig

Open domZippilli opened this issue 1 year ago • 11 comments

The MAX_NO_CHANNEL_PEERS constant provides helpful protection from an excessive number of peer connections, but as a constant, isn't user modifiable. The desired number of non-channel peers may vary depending on the application, so here we make it configurable.

domZippilli avatar Oct 24 '24 16:10 domZippilli

How many do y'all want? I'm of half a mind to just 10x it and call it a day.

TheBlueMatt avatar Oct 24 '24 16:10 TheBlueMatt

Today, 2500 -- tomorrow, THE WORLD!

We currently run a fork with 2500. This is presently plenty of headroom, but I might want to tweak it down...

domZippilli avatar Oct 24 '24 16:10 domZippilli

right. depending on the node's purpose and planned usage, a node operator might want to configure this to be higher or lower. Given how this impacts resource usage, it makes sense for this to be a tunable parameter.

br0thersharp avatar Oct 24 '24 16:10 br0thersharp

Given how this impacts resource usage, it makes sense for this to be a tunable parameter.

I'm not entirely sure that it does impact resource usage that much. The maximum number of pre-funded channels definitely does, but no-channel peers are incredibly cheap (in ChannelManager). They may cost bandwidth for gossip, but IMO that is best dealt with at the gossip layer, not in the peer count limit.

TheBlueMatt avatar Oct 28 '24 18:10 TheBlueMatt

no-channel peers are incredibly cheap

Well, would no limit at all be better?

The options I can think of are (1) no limit, (2) a one-size-fits-all limit, or (3) a customizable limit with a sensible default.

domZippilli avatar Oct 28 '24 20:10 domZippilli

Well, would no limit at all be better?

I mean, there is some cost, just not a hell of a lot. Not sure that actually no limit is the right answer given there is some non-zero cost, even if its tiny.

TheBlueMatt avatar Oct 28 '24 20:10 TheBlueMatt

Maybe leave the current limit as default, make it configurable, and then just add to the doc comment there is a low cost to no-channel peers, but may lead to a higher cost at gossip/bandwidth and use responsibly etc.

dunxen avatar Mar 14 '25 13:03 dunxen

Maybe leave the current limit as default, make it configurable, and then just add to the doc comment there is a low cost to no-channel peers, but may lead to a higher cost at gossip/bandwidth and use responsibly etc.

SGTM, of course. I think this PR will get you that, minus the doc comment.

domZippilli avatar Mar 14 '25 13:03 domZippilli

Hey, I wanted to bump this. I'm happy to rebase it or whatever is necessary. I think the bigger blocker is consensus that this should be configurable.

The main thing we don't seem to know/agree on above is the "cost" of a peer. I'd love to bring some data on the cost of a peer so that we can decide on a number, but it's not fixed. Two things can vary:

  • How much your typical peer "costs."
    • For example, last year, our node had an issue with a large population of peers that were sending a from-zero gossip sync every time they connected. We got that to change, and reduced peer-handling resource costs by 99%.
    • The latter is a bug, of course, but any kind of designed behavior could emerge in a Lightning client on either side of peer connection (ourselves, or a peer) that makes this cost increase/decrease.
  • What you are able to "pay."
    • Fairly obvious here, but enterprisey routing nodes or custodial wallets have far greater resource access and can therefore support forwarding gossip or answering pings for a much larger population of non-counterparty peers than say, someone bootstrapping without a lot of financial support. For the latter, a fixed amount of 2500 may be far too high and lead to resource exhaustion.

domZippilli avatar Jul 08 '25 18:07 domZippilli

Okay, finally coming back to this post-0.2, sorry about that.

I'm still not really convinced it makes sense to add a config knob here. IMO config knobs are not free - we generally expect downstream devs to consider the knobs that exist and set them correctly, with sufficient guidance that we hope its somewhat straightforward to do so. For something like max peers, we (LDK) probably have more knowledge about the "cost" of peers than our average downstream dev, implying we should be the one setting it, not them (though that's certainly not universally the case, and I'm sure y'all specifically know way more than we do in this instance!).

You raise specific concerns over bugs which had to have peer count limited, but I don't really think a config knob on peer limits is the right answer there. Rather, in this specific instance the right answer is probably either a custom message handler that limits peers with specific feature flags or an intercepting message handler on gossip messages that disconnects peers when they're flooding us with gossip. In the general case, I'd hope emergency deployments which attempt to fix DoS issues would do something similar to one of those - temporarily targeting the specific peers causing load (until we can fix the DoS risk) rather than something as broad as a total peer count limit (though that's also straightforward to implement in a custom message handler, if really needed).

As for different node deployments, I'm not sure its as relevant here as you're thinking - basically the only nodes that should be accepting incoming connections are routing nodes that should be very comfortable with 2000+ connections. Maybe there's a difference in that an average routing node would only be comfortable with 1000 connections but a more enterprisey routing node is comfortable with 2500 connections, but is that really worth a config flag? If we just default the config to 1000 would anyone ever actually change it?

I am still worried we're addressing an issue in the wrong place. The reason we have connection limiting in ChannelManager is to avoid disconnecting peers we have channels with while still limiting the connection count to something less than our maximum file descriptors. Maybe we should make that more explicit in channelmanager.rs and actually fetch the file descriptor limit and limit peers to that / 2. We should likely independently add a limit for the number of peers we'll accept gossip from, disconnecting peers that spam us with crap when we didn't request it.

TheBlueMatt avatar Nov 14 '25 15:11 TheBlueMatt

Also happy to hop on a call and discuss this live rather than going back and forth endlessly in messages.

TheBlueMatt avatar Nov 14 '25 15:11 TheBlueMatt