bitcoin icon indicating copy to clipboard operation
bitcoin copied to clipboard

addrman: delete addresses that don't belong to the supported networks

Open brunoerg opened this issue 2 years ago • 10 comments

This PR adds a new flag -cleanupaddrman. When initializing a node with this flag, it will delete addresses from addrman (both new and tried tables) that do not belong to the supported networks (e.g. -onlynet). Deleting addresses that do not belong to the supported networks can avoid a lot of unnecessary addrman Select calls (especially when turning from clearnet + Tor/I2P/CJDNS to Tor/I2P/CJDNS).

  1. This flag is not enabled by default, if user intends to go back to the previous network, just do not set it.
  2. Addresses from non-supported networks will naturely be replaced in addrman since we only store addresses from networks we support (cleaning up them on init is a way to avoid spending resources on it).
  3. Avoiding these addresses in addrman can avoid exceeding the maximum number of tries in ThreadOpenConnections.
// If we didn't find an appropriate destination after trying 100 addresses fetched from addrman,
// stop this loop, and let the outer loop run again (which sleeps, adds seed nodes, recalculates
// already-connected network ranges, ...) before trying new addrman addresses.
nTries++;
if (nTries > 100)
    break;
  1. Specially turning from clearnet + Tor/I2P/CJDNS to Tor/I2P/CJDNS, this feature it will ensure we will relay more addresses from these networks to other peers.

brunoerg avatar Jan 26 '24 20:01 brunoerg

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK 1440000bytes

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
  • #28960 (kernel: Remove dependency on CScheduler by TheCharlatan)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot avatar Jan 26 '24 20:01 DrahtBot

cc: @mzumsande

brunoerg avatar Jan 26 '24 21:01 brunoerg

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/20916324600

DrahtBot avatar Jan 26 '24 22:01 DrahtBot

Having these records around might be useful too, if you go back to those networks.

Do we have any practical problems with these Select calls? Too much CPU time, orz disk reads, or memory use?

Not only this would motivate this code change and review use better, but also it would help us understand under which circumstances a node operator might actually want to do this....And then maybe the right thing would be to trigger this cleaning automatically rather than thinking a user will trigger it.

Or, what i think is even more likely, we may rather change Addrman architecture to facilitate this better in the first place.

naumenkogs avatar Jan 29 '24 09:01 naumenkogs

+1 to the questions asked by @naumenkogs. I agree with the statement in the OP that this change would "avoid a lot of unnecessary addrman Select calls (especially when...", but would like additional context as to how that would tangibly impact the node operations or end user.

in general I am hesitant to add init params since it increases the combinatoric space of startup possibilities & they are hard to deprecate over time. maybe one idea (if we want to support this feature) could be to add a param to -onlynet instead of a whole new startup option.

amitiuttarwar avatar Jan 31 '24 22:01 amitiuttarwar

This PR adds a new flag -cleanupaddrman. When initializing a node with this flag, it will delete addresses from addrman (both new and tried tables) that do not belong to the supported networks (e.g. -onlynet).

I don't think this is a good idea

This flag is not enabled by default.

Concept ACK because it's not enabled by default and could be useful for testing.

1440000bytes avatar Feb 01 '24 01:02 1440000bytes

@naumenkogs and @amitiuttarwar I just updated the description with more infos. Thanks!


Or, what i think is even more likely, we may rather change Addrman architecture to facilitate this better in the first place.

@naumenkogs: What do you mean by changing addrman architecture?

brunoerg avatar Feb 02 '24 20:02 brunoerg

Deleting addresses that do not belong to the supported networks can avoid a lot of unnecessary addrman Select calls

It would be good to have some intution for how big this effect is. For example, if you have an addrman with mostly IPv4/IPv6 addresses and a few onion ones then restart with -onlynet=onion, will it take noticeably longer to find peers? The worst case scenario would be if you had just a single onion address, but that doesn't seem realistic, so I'd be more interested about the case where you have between 5% and 10% onion addresses.

mzumsande avatar Feb 06 '24 18:02 mzumsande

What about setting preferred_net

https://github.com/bitcoin/bitcoin/blob/7143d4388407ab3d12005e55a02d5e8f334e4dc9/src/net.cpp#L2682

if -onlynet is used? Surely it is a waste to select an unreachable address:

https://github.com/bitcoin/bitcoin/blob/7143d4388407ab3d12005e55a02d5e8f334e4dc9/src/net.cpp#L2695-L2697

That would skew the outcome though. If addrman has 60% IPv4, 25% Tor and 15% I2P addresses right now it is more likely to select Tor over I2P. If we run with -onlynet=tor -onlynet=i2p and set preferred_net to Tor or I2P randomly (50% chance) to avoid the IPv4 addresses, then it will not be more likely to pick Tor. It would be relatively easy to extend that parameter to preferred_nets -- meaning to return an address from one of the given networks.

  1. Specially turning from clearnet + Tor/I2P/CJDNS to Tor/I2P/CJDNS, this feature it will ensure we will relay more addresses from these networks to other peers.

clearnet addresses might be useful to Tor peers if they are multi-homed. I think it is good that we send any address type to any peer. Instead of e.g. only Tor addresses to Tor peers.

vasild avatar Feb 11 '24 18:02 vasild

What about setting preferred_net

Yes, I previously discussed this options with @mzumsande. It might be an alternative, sure.

That would skew the outcome though. If addrman has 60% IPv4, 25% Tor and 15% I2P addresses right now it is more likely to select Tor over I2P. If we run with -onlynet=tor -onlynet=i2p and set preferred_net to Tor or I2P randomly (50% chance) to avoid the IPv4 addresses, then it will not be more likely to pick Tor. It would be relatively easy to extend that parameter to preferred_nets -- meaning to return an address from one of the given networks.

Sure, we can extend addrman's Select to deal with more than one network and return an address from one of the given networks.

clearnet addresses might be useful to Tor peers if they are multi-homed. I think it is good that we send any address type to any peer. Instead of e.g. only Tor addresses to Tor peers.

Sure, this is something we can dicuss because not ensuring we're sending Tor addresses to Tor peers is also not positive. But I agree on relaying clearnet addresses for those nodes. I do not have this data, but I suppose that most Tor peers also run on clearnet?

brunoerg avatar Feb 12 '24 19:02 brunoerg

not ensuring we're sending Tor addresses to Tor peers is also not positive

Right, if we sent solely IPv4 addresses to a Tor peer while we have Tor addresses, that seems like something that can be improved.

vasild avatar Feb 13 '24 11:02 vasild

I'll close this PR to work on a "less aggressive" approach (extending Select to support multiple networks). I will provide my benchmarks on that new PR.

Thanks, @vasild @mzumsande @naumenkogs @amitiuttarwar.

brunoerg avatar Feb 13 '24 21:02 brunoerg