rust-libp2p
rust-libp2p copied to clipboard
feat(kad): improve options to efficiently retrieve providers
Still a draft as I am opening this to discuss the details and direction of this.
Description
The end goal is to allow the following
- when calling
get_providers
give the caller results back in real time, not collect all provider records - when calling
get_providers
allow the total number of records to be of a fixed limit, such that not necessarily 20 peers need to be contacted (as is the case in the current impl)
Implementation
There are currently two different commits, the first one implements the basic limit functionality, but this gets replaced with a more general solution in 2868e7bb14c9b267b2c50d811895f10cf3945b46, which streams the results and allows the caller to .finish()
the query when they have enough.
Things Left To Do
- [ ] decide which queries should be transitioned to the new progress based api
- [ ] potentially remove
Quorum
based on the new functionality, as this now could be done by the caller if using the progression based api - [ ] write changelog
I am guessing that this is already significantly speeding up your query times @dignifiedquire?
yes very much
@kpp @tomaka do you have thoughts on this? This will require a small change in the Authority Discovery module.
@koivunej what do you think of this proposal? I would guess rust-ipfs could benefit from the performance gains as well.
Thanks for the ping @mxinden but I've been out of the loop for a while again so I cannot really comment anything. Will try to keep up to date with this and any follow-up release.
This will require a small change in the Authority Discovery module.
I am not sure how many changes this will require. I looked through the code and found 0 usages of Providers.
This will require a small change in the Authority Discovery module.
I am not sure how many changes this will require. I looked through the code and found 0 usages of Providers.
@kpp the API changes would as well apply to get_record
, which Substrate is using last time I worked on it.
@dignifiedquire let me know once this is ready for another review. Excited for this to eventually land.
I would like to keep the
master
branch consistent at all times. Thus I would prefer the changes to the other query types to happen in this pull request as well. Would you be willing to do that work as well @dignifiedquire?
Yeah, I'll work on it, just wanted to get the structure right first.
@mxinden if we remove the quorum from get_record
, is your expectation that there is no termination condition anymore, and the caller always has to manually terminate, or should there be a default termination?
@mxinden if we remove the quorum from
get_record
, is your expectation that there is no termination condition anymore, and the caller always has to manually terminate, or should there be a default termination?
I would expect the QueryPeerIter
to eventually run out of peers and thus I would expect the query to eventually terminate.
Do you think this is too implicit / not intuitive for users? In other words should we consider the potentially unnecessary work significant and thus the whole API a footgun?
Friendly ping @dignifiedquire. Would be unfortunate for this to go stale. Anything you would need from my end?
Do I understand correctly, that this is the only reason why iroh depends on a fork of rust-libp2p?
Do I understand correctly, that this is the only reason why iroh depends on a fork of rust-libp2p?
yes, that is correct
Do you think this is too implicit / not intuitive for users? In other words should we consider the potentially unnecessary work significant and thus the whole API a footgun?
I am a little afraid that could happen to be honest.
Friendly ping @dignifiedquire. Would be unfortunate for this to go stale. Anything you would need from my end?
Honestly I am not sure how much sense it really makes to change the others behaviour atm, other than the changes I made so far. I will likely have a better understanding down the line when starting to use the rest of the API similarly intensely as I do provider finding atm.
So I would love if we can find something close to what is there that we can merge, and I (or others) can work on improving the other pieces of the API as needed.
Honestly I am not sure how much sense it really makes to change the others behaviour atm, other than the changes I made so far.
Do you see any issues with applying the new mechanism to GetRecord
?
I understand your reasoning on only doing this change to the GetRecord
system once we have a deeper understanding of it. That said, I do want to keep the libp2p-kad
API surface consistent, i.e. not offer two different mechanisms to execute operations on the DHT. I argue that this consistency helps both user experience and maintainability. Thus I suggest to either include the changes across the entire API surface, or delaying this pull request until we have more information, but not merge as is.
@mxinden fair enough, I updated the GetRecord
api, looks pretty alright to me in tests. Do you think that is enough, in terms of updates or do you want any other api calls to be changed to return multiple results?
If I am not mistaken, from a high level, only
FindNode
is missing
I am not sure exactly which public facing api you are referring to with FindNode
?
If I am not mistaken, from a high level, only
FindNode
is missingI am not sure exactly which public facing api you are referring to with
FindNode
?
Sorry, this was confusing. I am referring to Kademlia::get_closest_peers
which sends Kademlia FIND_NODE
messages to nodes in the network.
@dignifiedquire is this ready for another review?
should be now 😅
This pull request has merge conflicts. Could you please resolve them @dignifiedquire? 🙏
Hope to get a full review done tomorrow. That way you don't have to follow up on many (small) comments, but instead one large batch.
Went through all of the changes.
Thank you :) I think addressed all of them now
In my eyes there is still work to do to make the API changes consistent.
In addition to the inline comments you left?
Let me know if you want input on this from me as well, otherwise I'll leave it to you @mxinden :)
@mxinden applied CR & your patches and rebased
@dignifiedquire please don't force push on rust-libp2p pull requests. In particular this makes collaborating on a pull request hard. We squash-merge, thus a clean commit history, while appreciated, isn't necessary.
Sorry about that, force of habit 😓
@mxinden anything I can help with to move this forward?
This pull request has merge conflicts. Could you please resolve them @dignifiedquire? 🙏
@mxinden first pass at changelog done
Some small suggestions
thanks a lot, merged them all
Hmm, this is quite a lot to read: now that I’m facing a somewhat non-obvious upgrade for ipfs-embed (as I’ve never used Kademlia), what is the upgrade path for the quorum removal from get_record
? @mxinden @dignifiedquire