bips icon indicating copy to clipboard operation
bips copied to clipboard

BIP 331: Ancestor Package Relay

Open glozow opened this issue 2 years ago • 25 comments

ML thread: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-May/020493.html

Supersedes #1324.

glozow avatar Oct 13 '22 21:10 glozow

Assigned BIP # 331

luke-jr avatar Oct 21 '22 12:10 luke-jr

Updated for number

glozow avatar Oct 24 '22 13:10 glozow

I'm not a big mempool expert, but this seems like a good direction.


  1. There are obvious typos and misalignments you're probably aware of, i assume we shouldn't merge them?

Package Information Only

This example seems confusing: it violates the following rule A node MUST NOT send a "ancpkginfo" message that has not been requested by the recipient. Upon receipt of an unsolicited "ancpkginfo", a node may disconnect the sender. It will likely work if you expand the example to 3 transactions?

  1. Unclear what to do when transactions are replaced in the middle of this process. Should that be covered? Probably will become clearer when you prototype.

  2. Same about timestamping concerns i guess.

  3. As a style comment, I'd suggest doing some compression of the text :)

  4. Is Package Erlay possible? 6a) I actually made some attempt to handle parent/child within Erlay, do you think we should exercise the overlap between the two better? E.g., for those packages flowing through the network (not the IBD case, but fee-bump or something) Erlay might do the batching you need. 6b) One might also think about "what if we reconcile packages"? Probably bad idea, because the sets (packages) are too small? My point is that this should not be confused with what you describing.

  5. MSG_ANCPKGINFO and getpkgtxns have some redundancy in the data... The latter could refer to the hash of the former + the indexes, for example? Not sure savings worth the complexity (how often this will be used? every lightning closure?) but maybe worth making it into q/a :)

naumenkogs avatar Dec 23 '22 10:12 naumenkogs

Thanks for the review @naumenkogs! yes, I think it's best to talk specifics with the actual implementation but I'll throw some quick answers for now:

Unclear what to do when transactions are replaced in the middle of this process. Should that be covered? Probably will become clearer when you prototype.

One should treat this situation the same way we do for transactions that get replaced in between inv and getdata (I don't see any reason it should be different). Currently, we send "notfound". If in the future we decide to relay recently-replaced transactions, then package relay would include recently-replaced transactions as well. Though I agree this seems more like an implementation question than a spec question. In the spec, I would leave it as "the receiver may respond with notfound."

I actually made some attempt to handle parent/child within Erlay, do you think we should exercise the overlap between the two better? E.g., for those packages flowing through the network (not the IBD case, but fee-bump or something) Erlay might do the batching you need.

Yes it's great that Erlay orders them correctly. Note below-fee-filter transactions to the reconciliation bucket so not every case is handled. I plan to have a rebase-on-top-of-Erlay version so we can exercise the two together.

One might also think about "what if we reconcile packages"? Probably bad idea, because the sets (packages) are too small? My point is that this should not be confused with what you describing.

Mostly the "Is Package Erlay possible" section was to clarify that they aren't mutually exclusive. Reconciling packages is I think something that may come up when we get to sender-initiated package relay. I'm thinking, after reconciliation, it may be useful to send a package inv instead of individual tx invs if there are fee-bumps... But for now, since packages must be requested by the receiver, we don't need to think about that yet.

MSG_ANCPKGINFO and getpkgtxns have some redundancy in the data... The latter could refer to the hash of the former + the indexes, for example?

Yeah that's an interesting idea, we could do a bitfield of 4 bytes (should cover any package since max count is 25). However this means you can only request getpkgtxns from somebody who sent an ancpkginfo (currently this is not a requirement). It also requires the sender to keep track of (allocate memory for, remember to expire) what they announced so they know what "the ith transaction from this package" means. Thoughts?

glozow avatar Jan 09 '23 11:01 glozow

Though I agree this seems more like an implementation question than a spec question. In the spec, I would leave it as "the receiver may respond with notfound."

The thing with current notfound is that it's atomic I guess — makes it easy to consider every announcement independent and our code treats it as such (in a poor way but there are also no expectations). With packages, I imagine a CPFP transaction gets replaced and dropped from the package due. A package recipient would then do what, drop the entire package, or consider it invalid?

A requestor will get this package which doesn't pass the fees, and drop it on the floor? Or should it re-request the package? At the moment, I'm not interested in the final answer — but interested in making sure we don't miss such questions.

Reconciling packages is I think something that may come up when we get to sender-initiated package relay. I'm thinking, after reconciliation, it may be useful to send a package inv instead of individual tx invs if there are fee-bumps... But for now, since packages must be requested by the receiver, we don't need to think about that yet.

Not sure this anything to do with who initiates packages. Let's think about the current proposal. Imagine, upon the announcement of C, the receiver sends getdata(MSG_ANCPKGINFO, C), along with some compressed data saying that he already has 1 parent and 1 child of C (by just looking at the wtxid). Then it would save us announcing these 2 wtxids in the reverse direction.

Yeah that's an interesting idea, we could do a bitfield of 4 bytes (should cover any package since max count is 25). However this means you can only request getpkgtxns from somebody who sent an ancpkginfo (currently this is not a requirement). It also requires the sender to keep track of (allocate memory for, remember to expire) what they announced so they know what "the ith transaction from this package" means. Thoughts?

I don't think memory allocation and stuff is a show-stopper, but it might be too much complexity if the gain is low. We should measure how much savings it allows... I think saving 28 byte per announced tx might be worthy. As for the "same peer", do you ever expect packages to be requested from a different peer from the one announced them?

naumenkogs avatar Jan 09 '23 13:01 naumenkogs

@jnewbery thanks, took all suggestions

CI seems to be failing on bip-0327 title being too long, so will ignore for now.

glozow avatar Mar 28 '23 10:03 glozow

Sorry about that, the BIP-327 title issue has been fixed, so please rebase.

kallewoof avatar Mar 30 '23 01:03 kallewoof

Thanks @jnewbery @stickies-v @willcl-ark for the feedback. I'm working on incorporating the suggestions, apologies for the delay.

glozow avatar Apr 05 '23 15:04 glozow

Changed the version field in "sendpackages" to versions, now interpreted as a bitfield, and updated the negotiation procedure accordingly. (https://github.com/bitcoin/bips/pull/1382#discussion_r1173605835) Swapped the inv values for MSG_ANCPKGINFO and MSG_PKGTXNS (https://github.com/bitcoin/bips/pull/1382#discussion_r1156267093)

Neither of these are implemented in the branch I had linked before, so I've removed that link. Will update when I've done that.

I've restructured the BIP to make the difference between "generic" (can be used in future versions of package relay) vs "ancestor info" (first defined version) parts more clear (https://github.com/bitcoin/bips/pull/1382#discussion_r1155782599). Also updated the diagrams and took wording suggestions.

glozow avatar Apr 22 '23 16:04 glozow

Maybe it's obvious or I am missing something but I couldn't see the answer in the doc: How would a node decide between sending Inv(PKGINFO, C) or Inv(WTX, C) in the Generic Package Relay example graphic? Is it always sending PKGINFO if the peer wants packages and C is part of a package?

fjahr avatar May 01 '23 16:05 fjahr

@fjahr "The inv type should not be used in announcements" it's a type used for getdata only IIUC

instagibbs avatar May 02 '23 14:05 instagibbs

@fjahr "The inv type should not be used in announcements" it's a type used for getdata only IIUC

Hmm, ok, first insight: the inconsistencies in the message naming can be confusing. I guess it gets obvious when someone goes deeper but I did a pretty superficial pass and then tried to find an answer to this question and I guess I stumbled over that it's PKGINFO in the graphics but ANCPKGINFO in the text.

It seems that the name has changed before which makes researching past discussions on the mailing list harder. Maybe that could be noted somewhere (the artist formerly known as PKGINFO, also formerly know as MSG_PCKG1 etc.). Not sure it's worth the hassle but in this case it would have helped me.

In an old ML post I found:

''Q: Under what circumstances should a sender announce a
child-with-unconfirmed-parents package?''

A child-with-unconfirmed-parents package for a transaction should be
announced when it meets the peer's fee filter but one or more of its
parents don't; a "inv(MSG_PCKG1)" instead of "inv(WTX)" should be sent
for the child. Each of the parents which meet the peer's fee filter
should still be announced normally.

(from https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-May/020493.html)

But this is now changed it seems so maybe the graphics can be updated in this regard as well and the inv messages removed?

fjahr avatar May 02 '23 15:05 fjahr

Yes, PKGINFO is not specified anywhere; think of it as an "abstract class." The purpose of that diagram is to show how this protocol might be extended or upgraded in the future, i.e. we can define a new type of PKGINFO message to be used in invs and/or getdatas, and continue using the same getpkgtxns and pkgtxns message for transaction download. I want to separate those 2 concepts: exchanging package information and exchanging transaction data.

For example, CHUNKPKGINFO could contain a chunk (with cluster-based mempool), perhaps specifying the chunk fee and size as well. Originally, I proposed a PCKGINFO1 type which was a child and its parents to communicate a fee-bumping relationship.

The reason there is an "inv(PKGINFO)" there is to communicate that, while inv(ANCPKGINFO) is not allowed, some other kind of inv(PKGINFO) can be allowed in the future for a sender-initiated protocol.

I will update the diagram to make this more clear, thanks for the feedback and sorry for the confusion!

glozow avatar May 03 '23 10:05 glozow

Linking https://github.com/nostr-protocol/nips/pull/476, nostr-based package relay.

joostjager avatar May 05 '23 14:05 joostjager

bip-0331/package_cpfp_flow.png <--- looks like receiver is asking for A even though they already have it

instagibbs avatar May 30 '23 17:05 instagibbs

bip-0331/package_cpfp_flow.png <--- looks like receiver is asking for A even though they already have it

Thanks for catching!!! Fixed

glozow avatar May 31 '23 11:05 glozow

ACK, thanks for the updates, I find it an easier read now and didn't see any other issues in my latest pass.

One more question: this protocol seems pretty convenient for fingerprinting nodes by giving specific ancestors to different connections and then observing which tx the connections request to complete the package. I don't think this is a huge problem since there are several other options that allow fingerprinting. But I am curious if this has been discussed somewhere in the past or if there are already ideas for mitigations in the implementation (which I haven't fully reviewed yet).

fjahr avatar Jun 05 '23 20:06 fjahr

@fjahr can you think of a fundamentally new vectors vs the back and forth orphan fetching? Package fetching seems to be more of a optimization on that. Higher resolution fingerprinting, if it's more reliable I guess :)

instagibbs avatar Jun 05 '23 20:06 instagibbs

@fjahr can you think of a fundamentally new vectors vs the back and forth orphan fetching? Package fetching seems to be more of a optimization on that. Higher resolution fingerprinting, if it's more reliable I guess :)

I don't think it's fundamentally new, that's why I don't think it's a big issue. But I do think it makes the attack more convenient to scale.

What I had in mind: 1. Make an unconfirmed tx chain A through G. 2. Give different connections different combinations of the txs B through F. This should be 5! = 120 unique fingerprints. 3. Send G and respond to the getdata(ANCPKGINFO, G) to see which tx of B through F the connection is aware of.

fjahr avatar Jun 05 '23 20:06 fjahr

@fjahr fingerprinting is a good consideration but neither new nor made easier. An even more convenient approach: make invalid transactions (as many as you want for free), each with different wtxids, and send one to each node. Nodes will cache their unique rejections and not send a getdata on announcement.

El El lun, 5 jun 2023 a las 21:47, Fabian Jahr @.***> escribió:

@fjahr https://github.com/fjahr can you think of a fundamentally new vectors vs the back and forth orphan fetching? Package fetching seems to be more of a optimization on that. Higher resolution fingerprinting, if it's more reliable I guess :)

I don't think it's fundamentally new, that's why I don't think it's a big issue. But I do think it makes the attack more convenient to scale.

What I had in mind: 1. Make an unconfirmed tx chain A through G. 2. Give different connections different combinations of the txs B through F. This should be 5! = 120 unique fingerprints. 3. Send G and respond to the getdata(ANCPKGINFO, G) to see which tx of B through F the connection is aware of.

— Reply to this email directly, view it on GitHub https://github.com/bitcoin/bips/pull/1382#issuecomment-1577450727, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGAEGGKBQPRGH6NQJYKJNG3XJZAVLANCNFSM6AAAAAAREVY6QA . You are receiving this because you authored the thread.Message ID: @.***>

glozow avatar Jun 06 '23 10:06 glozow

Updated for suggestions from @ajtowns, thanks!

Major changes (still need to update implementation to reflect these):

  • Added a PKG_RELAY_PKGTXNS = 1 << 0 and PKG_RELAY_ANC is now 1 << 1 (signaling both is versions=3). (https://github.com/bitcoin/bips/pull/1382#discussion_r1225156027)
  • Set maximum 100 wtxids in a getpkgtxns request. (https://github.com/bitcoin/bips/pull/1382#discussion_r1228918874)

glozow avatar Jun 14 '23 14:06 glozow

I think I've addressed all comments now.

glozow avatar Jul 04 '23 11:07 glozow

Changed combined hash to be single instead of double sha256

glozow avatar Aug 07 '23 09:08 glozow

re-ACK https://github.com/bitcoin/bips/pull/1382/commits/02ec218c7857ef60914e9a3d383b68caf987f70b

suggested textual additions and change to single-sha2 for combined hash

instagibbs avatar Aug 08 '23 14:08 instagibbs

@luke-jr @kallewoof is anything else needed here?

glozow avatar Jan 17 '24 11:01 glozow

@luke-jr pinging again, can we get this merged?

glozow avatar Apr 11 '24 10:04 glozow

pinging again, if any BIP editors are available to have a look?

glozow avatar Apr 24 '24 16:04 glozow

ACK 632f143fadaed8c91bba12d096e62feb2a308054

jonatack avatar Apr 24 '24 18:04 jonatack