bips
bips copied to clipboard
BIP 331: Ancestor Package Relay
ML thread: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-May/020493.html
Supersedes #1324.
Assigned BIP # 331
Updated for number
I'm not a big mempool expert, but this seems like a good direction.
-
There are obvious typos and misalignments you're probably aware of, i assume we shouldn't merge them?
Package Information Only
This example seems confusing: it violates the following rule A node MUST NOT send a "ancpkginfo" message that has not been requested by the recipient. Upon receipt of an unsolicited "ancpkginfo", a node may disconnect the sender.
It will likely work if you expand the example to 3 transactions?
-
Unclear what to do when transactions are replaced in the middle of this process. Should that be covered? Probably will become clearer when you prototype.
-
Same about timestamping concerns i guess.
-
As a style comment, I'd suggest doing some compression of the text :)
-
Is Package Erlay possible?
6a) I actually made some attempt to handle parent/child within Erlay, do you think we should exercise the overlap between the two better? E.g., for those packages flowing through the network (not the IBD case, but fee-bump or something) Erlay might do the batching you need. 6b) One might also think about "what if we reconcile packages"? Probably bad idea, because the sets (packages) are too small? My point is that this should not be confused with what you describing. -
MSG_ANCPKGINFO and getpkgtxns have some redundancy in the data... The latter could refer to the hash of the former + the indexes, for example? Not sure savings worth the complexity (how often this will be used? every lightning closure?) but maybe worth making it into q/a :)
Thanks for the review @naumenkogs! yes, I think it's best to talk specifics with the actual implementation but I'll throw some quick answers for now:
Unclear what to do when transactions are replaced in the middle of this process. Should that be covered? Probably will become clearer when you prototype.
One should treat this situation the same way we do for transactions that get replaced in between inv and getdata (I don't see any reason it should be different). Currently, we send "notfound". If in the future we decide to relay recently-replaced transactions, then package relay would include recently-replaced transactions as well. Though I agree this seems more like an implementation question than a spec question. In the spec, I would leave it as "the receiver may respond with notfound."
I actually made some attempt to handle parent/child within Erlay, do you think we should exercise the overlap between the two better? E.g., for those packages flowing through the network (not the IBD case, but fee-bump or something) Erlay might do the batching you need.
Yes it's great that Erlay orders them correctly. Note below-fee-filter transactions to the reconciliation bucket so not every case is handled. I plan to have a rebase-on-top-of-Erlay version so we can exercise the two together.
One might also think about "what if we reconcile packages"? Probably bad idea, because the sets (packages) are too small? My point is that this should not be confused with what you describing.
Mostly the "Is Package Erlay possible" section was to clarify that they aren't mutually exclusive. Reconciling packages is I think something that may come up when we get to sender-initiated package relay. I'm thinking, after reconciliation, it may be useful to send a package inv instead of individual tx invs if there are fee-bumps... But for now, since packages must be requested by the receiver, we don't need to think about that yet.
MSG_ANCPKGINFO and getpkgtxns have some redundancy in the data... The latter could refer to the hash of the former + the indexes, for example?
Yeah that's an interesting idea, we could do a bitfield of 4 bytes (should cover any package since max count is 25). However this means you can only request getpkgtxns from somebody who sent an ancpkginfo (currently this is not a requirement). It also requires the sender to keep track of (allocate memory for, remember to expire) what they announced so they know what "the ith transaction from this package" means. Thoughts?
Though I agree this seems more like an implementation question than a spec question. In the spec, I would leave it as "the receiver may respond with notfound."
The thing with current notfound
is that it's atomic I guess — makes it easy to consider every announcement independent and our code treats it as such (in a poor way but there are also no expectations). With packages, I imagine a CPFP transaction gets replaced and dropped from the package due. A package recipient would then do what, drop the entire package, or consider it invalid?
A requestor will get this package which doesn't pass the fees, and drop it on the floor? Or should it re-request the package? At the moment, I'm not interested in the final answer — but interested in making sure we don't miss such questions.
Reconciling packages is I think something that may come up when we get to sender-initiated package relay. I'm thinking, after reconciliation, it may be useful to send a package inv instead of individual tx invs if there are fee-bumps... But for now, since packages must be requested by the receiver, we don't need to think about that yet.
Not sure this anything to do with who initiates packages. Let's think about the current proposal. Imagine, upon the announcement of C, the receiver sends getdata(MSG_ANCPKGINFO, C), along with some compressed data saying that he already has 1 parent and 1 child of C (by just looking at the wtxid). Then it would save us announcing these 2 wtxids in the reverse direction.
Yeah that's an interesting idea, we could do a bitfield of 4 bytes (should cover any package since max count is 25). However this means you can only request getpkgtxns from somebody who sent an ancpkginfo (currently this is not a requirement). It also requires the sender to keep track of (allocate memory for, remember to expire) what they announced so they know what "the ith transaction from this package" means. Thoughts?
I don't think memory allocation and stuff is a show-stopper, but it might be too much complexity if the gain is low. We should measure how much savings it allows... I think saving 28 byte per announced tx might be worthy. As for the "same peer", do you ever expect packages to be requested from a different peer from the one announced them?
@jnewbery thanks, took all suggestions
CI seems to be failing on bip-0327 title being too long, so will ignore for now.
Sorry about that, the BIP-327 title issue has been fixed, so please rebase.
Thanks @jnewbery @stickies-v @willcl-ark for the feedback. I'm working on incorporating the suggestions, apologies for the delay.
Changed the version
field in "sendpackages" to versions
, now interpreted as a bitfield, and updated the negotiation procedure accordingly. (https://github.com/bitcoin/bips/pull/1382#discussion_r1173605835)
Swapped the inv values for MSG_ANCPKGINFO
and MSG_PKGTXNS
(https://github.com/bitcoin/bips/pull/1382#discussion_r1156267093)
Neither of these are implemented in the branch I had linked before, so I've removed that link. Will update when I've done that.
I've restructured the BIP to make the difference between "generic" (can be used in future versions of package relay) vs "ancestor info" (first defined version) parts more clear (https://github.com/bitcoin/bips/pull/1382#discussion_r1155782599). Also updated the diagrams and took wording suggestions.
Maybe it's obvious or I am missing something but I couldn't see the answer in the doc: How would a node decide between sending Inv(PKGINFO, C)
or Inv(WTX, C)
in the Generic Package Relay example graphic? Is it always sending PKGINFO
if the peer wants packages and C is part of a package?
@fjahr "The inv type should not be used in announcements" it's a type used for getdata
only IIUC
@fjahr "The inv type should not be used in announcements" it's a type used for
getdata
only IIUC
Hmm, ok, first insight: the inconsistencies in the message naming can be confusing. I guess it gets obvious when someone goes deeper but I did a pretty superficial pass and then tried to find an answer to this question and I guess I stumbled over that it's PKGINFO
in the graphics but ANCPKGINFO
in the text.
It seems that the name has changed before which makes researching past discussions on the mailing list harder. Maybe that could be noted somewhere (the artist formerly known as PKGINFO
, also formerly know as MSG_PCKG1
etc.). Not sure it's worth the hassle but in this case it would have helped me.
In an old ML post I found:
''Q: Under what circumstances should a sender announce a
child-with-unconfirmed-parents package?''
A child-with-unconfirmed-parents package for a transaction should be
announced when it meets the peer's fee filter but one or more of its
parents don't; a "inv(MSG_PCKG1)" instead of "inv(WTX)" should be sent
for the child. Each of the parents which meet the peer's fee filter
should still be announced normally.
(from https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-May/020493.html)
But this is now changed it seems so maybe the graphics can be updated in this regard as well and the inv messages removed?
Yes, PKGINFO
is not specified anywhere; think of it as an "abstract class." The purpose of that diagram is to show how this protocol might be extended or upgraded in the future, i.e. we can define a new type of PKGINFO
message to be used in invs and/or getdatas, and continue using the same getpkgtxns and pkgtxns message for transaction download. I want to separate those 2 concepts: exchanging package information and exchanging transaction data.
For example, CHUNKPKGINFO
could contain a chunk (with cluster-based mempool), perhaps specifying the chunk fee and size as well. Originally, I proposed a PCKGINFO1
type which was a child and its parents to communicate a fee-bumping relationship.
The reason there is an "inv(PKGINFO)" there is to communicate that, while inv(ANCPKGINFO)
is not allowed, some other kind of inv(PKGINFO)
can be allowed in the future for a sender-initiated protocol.
I will update the diagram to make this more clear, thanks for the feedback and sorry for the confusion!
Linking https://github.com/nostr-protocol/nips/pull/476, nostr-based package relay.
bip-0331/package_cpfp_flow.png <--- looks like receiver is asking for A even though they already have it
bip-0331/package_cpfp_flow.png <--- looks like receiver is asking for A even though they already have it
Thanks for catching!!! Fixed
ACK, thanks for the updates, I find it an easier read now and didn't see any other issues in my latest pass.
One more question: this protocol seems pretty convenient for fingerprinting nodes by giving specific ancestors to different connections and then observing which tx the connections request to complete the package. I don't think this is a huge problem since there are several other options that allow fingerprinting. But I am curious if this has been discussed somewhere in the past or if there are already ideas for mitigations in the implementation (which I haven't fully reviewed yet).
@fjahr can you think of a fundamentally new vectors vs the back and forth orphan fetching? Package fetching seems to be more of a optimization on that. Higher resolution fingerprinting, if it's more reliable I guess :)
@fjahr can you think of a fundamentally new vectors vs the back and forth orphan fetching? Package fetching seems to be more of a optimization on that. Higher resolution fingerprinting, if it's more reliable I guess :)
I don't think it's fundamentally new, that's why I don't think it's a big issue. But I do think it makes the attack more convenient to scale.
What I had in mind: 1. Make an unconfirmed tx chain A through G. 2. Give different connections different combinations of the txs B through F. This should be 5! = 120 unique fingerprints. 3. Send G and respond to the getdata(ANCPKGINFO, G)
to see which tx of B through F the connection is aware of.
@fjahr fingerprinting is a good consideration but neither new nor made easier. An even more convenient approach: make invalid transactions (as many as you want for free), each with different wtxids, and send one to each node. Nodes will cache their unique rejections and not send a getdata on announcement.
El El lun, 5 jun 2023 a las 21:47, Fabian Jahr @.***> escribió:
@fjahr https://github.com/fjahr can you think of a fundamentally new vectors vs the back and forth orphan fetching? Package fetching seems to be more of a optimization on that. Higher resolution fingerprinting, if it's more reliable I guess :)
I don't think it's fundamentally new, that's why I don't think it's a big issue. But I do think it makes the attack more convenient to scale.
What I had in mind: 1. Make an unconfirmed tx chain A through G. 2. Give different connections different combinations of the txs B through F. This should be 5! = 120 unique fingerprints. 3. Send G and respond to the getdata(ANCPKGINFO, G) to see which tx of B through F the connection is aware of.
— Reply to this email directly, view it on GitHub https://github.com/bitcoin/bips/pull/1382#issuecomment-1577450727, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAEGGKBQPRGH6NQJYKJNG3XJZAVLANCNFSM6AAAAAAREVY6QA . You are receiving this because you authored the thread.Message ID: @.***>
Updated for suggestions from @ajtowns, thanks!
Major changes (still need to update implementation to reflect these):
- Added a
PKG_RELAY_PKGTXNS = 1 << 0
andPKG_RELAY_ANC
is now1 << 1
(signaling both is versions=3). (https://github.com/bitcoin/bips/pull/1382#discussion_r1225156027) - Set maximum 100 wtxids in a
getpkgtxns
request. (https://github.com/bitcoin/bips/pull/1382#discussion_r1228918874)
I think I've addressed all comments now.
Changed combined hash to be single instead of double sha256
re-ACK https://github.com/bitcoin/bips/pull/1382/commits/02ec218c7857ef60914e9a3d383b68caf987f70b
suggested textual additions and change to single-sha2 for combined hash
@luke-jr @kallewoof is anything else needed here?
@luke-jr pinging again, can we get this merged?
pinging again, if any BIP editors are available to have a look?
ACK 632f143fadaed8c91bba12d096e62feb2a308054