BIP442: OP_PAIRCOMMIT
OP_PAIRCOMMIT is the newest member of the LNhance family of opcodes. It provides limited vector commitment functionality in tapscript.
When evaluated, the OP_PAIRCOMMIT instruction:
- pops the top two values off the stack,
- takes the "PairCommit" tagged SHA256 hash of the stack elements,
- pushes the resulting commitment on the top of the stack.
Discussion: https://delvingbitcoin.org/t/op-paircommit-as-a-candidate-for-addition-to-lnhance/1216/12
Switched back to markdown. Header now in BIP-2 format.
The original create date of OP_PAIRCOMMIT is 2024-03-15 this is the latest revision based on feedback from Anthony Towns. https://gist.github.com/moonsettler/d7f1fb88e3e54ee7ecb6d69ff126433b/revisions What date should go to the header?
Added a discussion link to the PR description.
According to BIP 2:
The Created header records the date that the BIP was assigned a number, […]
Has this proposal been sent to the mailing list?
Has this proposal been sent to the mailing list?
~~Not yet. Wanted to get it into an acceptable shape before I post it there.~~
Proposed to the mailing list, waiting for feedback.
It looks like we gonna have to amend the PAIRCOMMIT BIP with some new use cases.
Turns out within certain practical limitations any computational function can be proven out in the form of a merkle tree.
The root hash of the merkle tree represents the function the leaves represent the inputs and output.
Any 32 bit arithmetic function can certainly be proven out with this method.
CAT itself with a limited set of inputs or limited input sizes can be proven out.
At this point it's an open question if this enables new behaviors not enabled by taproot MAST itself?
Special thanks to: @JeremyRubin @Ademan @bigspider
edit:
Alternatively could consider imposing specific script limits that make PAIRCOMMIT explicitly less capable than MAST itself.
I think I've changed my mind a bit. We were talking about computing a merkle tree for f(u32,u32) as if it was trivial but after a quick experiment it seems like that would take hundreds of years to compute (am I being dumb here?) Instead, you can compute mul(u32,u32) -> u32 using 3 mul(u16,u16)s which is feasible to compute. The witness size is worse, ~32 * 32 * 3 = 3072 instead of 32 * 64 * 1 = 2048, but computing the tree for mul(u16,u16) is feasible using a naive algorithm on commodity hardware.
The implication of this is that where a function can be decomposed into operations on smaller inputs, PAIRCOMMIT is massively more feasible to use than encoding things into a tap tree.
I think I've changed my mind a bit. We were talking about computing a merkle tree for
f(u32,u32)as if it was trivial but after a quick experiment it seems like that would take hundreds of years to compute (am I being dumb here?) Instead, you can computemul(u32,u32) -> u32using 3mul(u16,u16)s which is feasible to compute. The witness size is worse, ~32 * 32 * 3 = 3072instead of32 * 64 * 1 = 2048, but computing the tree formul(u16,u16)is feasible using a naive algorithm on commodity hardware.
Arithmetic and bitwise operations where inputs & outputs are small enough, can already be done in Script in cheaper ways. Merkle trees as lookup tables are only interesting for functions that are either extremely complex, or where preimages/images are larger than what Script can work with. Note that you can already do small indexed lookup tables more efficiently by just hard-coding them in Script (that is: push the table on the stack and use OP_PICK to read its entries), and these techniques are widely used (e.g. in BitVM).
The implication of this is that where a function can be decomposed into operations on smaller inputs, PAIRCOMMIT is massively more feasible to use than encoding things into a tap tree.
I think the only substantial difference is that in a Script where you need several lookups, you can do it with Merkle trees, while you can only do a single lookup with a precomputed taptree.
Proving general computation
Merkle trees can be used to prove out computation where the root of the tree represents the function and the leaves represent the inputs and output. There are practical limits to the entropy space for the inputs as it needs to be iterated over and hashed up.
Currently MAST trees can cover 128 bits of entropy space, which is well over the practical limits to iterate over and merklize. Therefore we assume this capability does not materially extend what computations are possible to prove out in bitcoin script. While
OP_PAIRCOMMITis not limited to a height of 128, that should not be practically feasible to utilize.There is a way to reduce the size of the witness for proving out computation, by eliminating the merkle path inclusion proofs, using
OP_CHECKSIGFROMSTACKtogether withOP_PAIRCOMMIT. This method involves deleted key assumptions, most likely using MPC to create an enormous amount of signatures for the stack elements representing the inputs and the output of the function.
Is this correct? Any suggestions? @Ademan @bigspider
The implication of this is that where a function can be decomposed into operations on smaller inputs, PAIRCOMMIT is massively more feasible to use than encoding things into a tap tree.
This is the main open question I believe. does it or does it not practically expand what we can already do? For example using PC to emulate smolCAT and using traditional methods with lookup tables could make 32 bit or even 64 bit arithmetics more feasible?
edit:
Within the 32 bit realm we can already use OP_ADD, I see little practical diff between <0x1234> <0x5678> CAT and <0x12340000> <0x5678> ADD.
And it sounds like 64 bit smolCAT would be way too expensive to generate (and also to interact with trustlessly).
(actually the above examples are wrong, because internally bitcoin script uses little endian, but should convey the point)
...
Arithmetic and bitwise operations where inputs & outputs are small enough, can already be done in Script in cheaper ways. Merkle trees as lookup tables are only interesting for functions that are either extremely complex, or where preimages/images are larger than what Script can work with. Note that you can already do small indexed lookup tables more efficiently by just hard-coding them in Script (that is: push the table on the stack and use OP_PICK to read its entries), and these techniques are widely used (e.g. in BitVM).
Even u16,u16 is quite a bit larger than I think is practical as a lookup table, but the efficiency for repeated operations is constant, obviously. The lookup table is less efficient for small numbers of operations (a u8,u8 table is 16k vs 1 u8,u8 proof is 0.4k) but the merkle tree loses quickly when those operations are repeated.
The implication of this is that where a function can be decomposed into operations on smaller inputs, PAIRCOMMIT is massively more feasible to use than encoding things into a tap tree.
I think the only substantial difference is that in a Script where you need several lookups, you can do it with Merkle trees, while you can only do a single lookup with a precomputed taptree.
Right, and the key point is these merkle trees and lookup tables rapidly become infeasible to compute as the input size grows, so multiple smaller lookups is significantly more useful.
EDIT: But your point is well taken that for smaller operations they can already be better accomplished by lookup tables.
... edit: Within the 32 bit realm we can already use OP_ADD, I see little practical diff between
<0x1234> <0x5678> CATand<0x12340000> <0x5678> ADD. And it sounds like 64 bit smolCAT would be way too expensive to generate (and also to interact with trustlessly).(actually the above examples are wrong, because internally bitcoin script uses little endian, but should convey the point)
Yeah for arbitrary 8 byte strings smolCAT seems infeasible to compute the table or merkle tree for. After a bit of conversation on IRC it could probably be feasible for arbitrary f(b[4],b[4]) -> b[8] with a custom ASIC¹ or maybe a cluster of FPGAs in a span of ~a few years but that would not be very useful for the average person.
Bit shifts over 32 bit integers seems pretty feasible though, that's f(u32,u6)->u32 (maybe save some space by special casing shift = 0). it seems like my incredibly naive, unoptimized, single-core experiment could calculate that merkle tree in ~96 hours. Of course the proof is ~1.2k and users would likely need multiple, but the lookup table for that wouldn't fit in a block anyway so maybe something new is possible?
You can also separate positive and negative shifts, and maybe break it down into multiple rounds of shifts 1-3 or something (or 1k for a proof for a constant shift)
[1]: afaik existing ASICs operate on block headers so couldn't help
I think this BIP is already way more verbose than it was supposed to be.
It would be useful if we could reference it by a number. Can we get a BIP number assigned? Not asking for a merge yet.
Should we add this table to this BIP? And it's not just vBytes but also the number of sigops to consider, which is a cost all nodes on the p2p network have to bear.
edit: I think it looks like this:
| Method | ChannelS | UpdateSc | UpdateWi | 1-Update | 2-Update |
|---|---|---|---|---|---|
| APO-annex | 2.25 vB | 28.25 vB | 25 vB | 305 vB | 461.5 vB |
| APO-return | 2.25 vB | 28.25 vB | 16.5 vB | 338.5 vB | 528.5 vB |
| CTV+CSFS+IKEY | 2.75 vB | 12.25 vB | 24.5 vB | 331 vB | 513 vB |
| CTV+CSFS | 11 vB | 20.5 vB | 24.5 vB | 347.5 vB | 537.75 vB |
| LNhance | 3 vB | 12.5 vB | 32.75 vB | 297.75 vB | 446.25 vB |
| CTV+CSFS+VAULT | 18.75 vB | 30 vB | 35 vB | 333.75 vB | 502.25 vB |
| rekey | 7.25 vB | 16.75 vB | 73.75 vB | 347.25 vB | 541 vB |
| Method | ForceC | Update | Settle | OP_RETURN |
|---|---|---|---|---|
| APO-annex | 1 SigOp | 1 SigOp | 1 SigOp | |
| APO-return | 1 SigOp | 1 SigOp | 1 SigOp | X |
| CTV+CSFS+IKEY | 1 SigOp | 1 SigOp | CTV | X |
| CTV+CSFS | 1 SigOp | 1 SigOp | CTV | X |
| LNhance | 1 SigOp | 1 SigOp | CTV | |
| CTV+CSFS+VAULT | 2* SigOp | 2* SigOp | CTV | |
| rekey | 3 SigOp | 3 SigOp | CTV |
* VAULT is not exactly a SigOp, but close enough. Has a budget cost of 1.2 SigOps.
tbh i have no idea how to read that table, so might be good to have clearer labeling somehow / break down where the accounting came from?
This is based on @reardencode's spreadsheet: https://docs.google.com/spreadsheets/d/1UNW1AV7F8Srf-vHL5F3bKnkezbHIMabQqhU21tV915o/edit?gid=0#gid=0
It would be useful if we could reference it by a number. Can we get a BIP number assigned? Not asking for a merge yet.
Looking at the Motivation section, it seems to me that the main application for this proposal would be a construction that depends on three other undeployed proposals some of which are themselves draft stage or pre-draft. This proposal feels a bit hypothetical at this point. I’ll get back to you next week.
This proposal feels a bit hypothetical at this point.
While PAIRCOMMIT would only be truly useful with certain other future upgrades, it is proposed to activate in a bundle with said updates. We are in the process of trying to reach consensus on said package.
It's unlikely I would withdraw OP_PAIRCOMMIT, unless OP_CAT got activated first.
What date should go to the header?
The Created header records the date that the BIP was assigned a number (see BIP-2).
I get that there is a goal here to avoid introspection...
but it seems that it'd be more generically useful if the function were e.g., a TapBranch function, so then it could be used in the future with some other taproot editing opcodes.
e.g., if OP_TAPBRANCHCOMMIT were to lexicographic sort, then cat, then commit, you could do paircommit like functionality by doing:
<a> <b> TUCK OP_TAPBRANCHCOMMIT OP_TAPBRANCHCOMMIT
this works because while the first commit commits to either a||b or b||a, and also has sliceable errors (e.g., (a||b)[:i] and (a||b)[i:]), the second commit re-commits to <b> and it's length which uniquely determines order and prevents sliceable errors.
One concern: OP_TAPBRANCHCOMMIT is witness "malleable", in that items could show up in the witness stack in either order and get the same result. It'd still be possible to make non-malleable witnesses by requiring the stack elements to be in order with a OP_LEXSORT or equivalent functionality.
If we did OP_TAPBRANCHCOMMIT then with OP_CHECKCONTRACTVERIFY I believe you get TLUV behavior?
You can verify a tapleaf on the input and can also replace it with an other on the output?
To be honest the OP_PAIRCOMMIT domain separation was aimed to prevent the possibility of such uses as potentially being more controversial.
I think it's actually less controversial, because if you do OP_TAPBRANCHCOMMIT you're doing something that even once you have OP_CAT, is really handy to have (because sorting two strings is actually still pretty hard with CAT).
Whereas now you're getting a lot of people thinking that paircommit is not so useful if CAT gets in eventually.
Right. I did consider something like a sorting merkle operator for lamport stuff for example, however when you need order dependent commitments (which is pretty much everything we are doing with it rn) then you would have to use a dummy value like:
<a> <1> <b> OP_TAPBRANCHCOMMIT OP_TAPBRANCHCOMMIT
or maybe:
<a> <b> OP_SHA256 OP_TAPBRANCHCOMMIT
instead of:
<a> <b> OP_PAIRCOMMIT
So if I wanted to support both of these functionalities I would probably end up with 2 separate opcodes for them.
I think this is ready for a merge from my point of view. Rebased to latest master and squashed all the changes to 1 commit.
Maybe this is obvious to everyone else and I'm a slowpoke here, but PC+CSFS+CTV enables a kind of "multi-transaction-signature". You sign a merkle root that commits to a bunch of CTV transaction templates, and you can provide an inclusion proof with PC with the signature.
I've been spitballing with someone about a multiparty eltoo scheme and came up with a way to bound the state resolution but without state carrying covenants it requires a possible field of ~2^N transactions (of which, at most N will end up on-chain), and every party shares N*(N+1)/2 signatures.
With PC+CSFS+CTV it still requires ~2^N transactions, but each party only needs to share their 1 signature over the merkle root.
EDIT: Not 100% sure it works for my mutliparty eltoo scheme just yet, but I'm still >50% confident it does... EDIT EDIT: More confident it works, but it looks like the number of transactions is exponential rather than polynomial... and so my estimate for how many signatures would need to be shared without the "multi-transaction-signature" commitment is probably way low. EDIT EDIT EDIT: I genuinely have no idea how I decided "multi-signature" was an acceptable term considering it already has a different meaning in Bitcoin... s/"multi-signature"/"multi-transaction-signature"/g
Thanks so much Jon. we have a significant revision coming that does include defining abbreviations. @moonsettler should we move this to draft until we finish workshopping the revisions we're working on?
We have a rework of the BIP, the original can be viewed here
Please provide feedback on which approach you think is better! @murchandamus @jonatack
Hey, I see that there was a force push, but it is not clear to me whether the review comments have been addressed. Looking at a couple at least some of them seem to still apply. Could you please process the review comments, either reply on them if there is information to share or the conversation is still on-going, or resolve them if you have applied them or rejected them? Please mark this pull request as "Ready for review" when you would like another editor review.
Hey, I see that there was a force push, but it is not clear to me whether the review comments have been addressed. Looking at a couple at least some of them seem to still apply. Could you please process the review comments, either reply on them if there is information to share or the conversation is still on-going, or resolve them if you have applied them or rejected them? Please mark this pull request as "Ready for review" when you would like another editor review.
Sorry, I don't think we addressed everything. Should go through every suggestion one by one. Right now I can't prioritize this sadly. Not sure if @reardencode wishes to pick up the slack in the meantime.