bips icon indicating copy to clipboard operation
bips copied to clipboard

BIP442: OP_PAIRCOMMIT

Open moonsettler opened this issue 1 year ago • 42 comments

OP_PAIRCOMMIT is the newest member of the LNhance family of opcodes. It provides limited vector commitment functionality in tapscript.

When evaluated, the OP_PAIRCOMMIT instruction:

  • pops the top two values off the stack,
  • takes the "PairCommit" tagged SHA256 hash of the stack elements,
  • pushes the resulting commitment on the top of the stack.

Discussion: https://delvingbitcoin.org/t/op-paircommit-as-a-candidate-for-addition-to-lnhance/1216/12

moonsettler avatar Nov 11 '24 23:11 moonsettler

Switched back to markdown. Header now in BIP-2 format.

moonsettler avatar Nov 13 '24 20:11 moonsettler

The original create date of OP_PAIRCOMMIT is 2024-03-15 this is the latest revision based on feedback from Anthony Towns. https://gist.github.com/moonsettler/d7f1fb88e3e54ee7ecb6d69ff126433b/revisions What date should go to the header?

moonsettler avatar Nov 13 '24 21:11 moonsettler

Added a discussion link to the PR description.

jonatack avatar Nov 14 '24 02:11 jonatack

According to BIP 2:

The Created header records the date that the BIP was assigned a number, […]

murchandamus avatar Nov 14 '24 15:11 murchandamus

Has this proposal been sent to the mailing list?

murchandamus avatar Nov 14 '24 22:11 murchandamus

Has this proposal been sent to the mailing list?

~~Not yet. Wanted to get it into an acceptable shape before I post it there.~~

Proposed to the mailing list, waiting for feedback.

moonsettler avatar Nov 14 '24 22:11 moonsettler

It looks like we gonna have to amend the PAIRCOMMIT BIP with some new use cases. Turns out within certain practical limitations any computational function can be proven out in the form of a merkle tree. The root hash of the merkle tree represents the function the leaves represent the inputs and output. Any 32 bit arithmetic function can certainly be proven out with this method. CAT itself with a limited set of inputs or limited input sizes can be proven out. At this point it's an open question if this enables new behaviors not enabled by taproot MAST itself?

Special thanks to: @JeremyRubin @Ademan @bigspider

edit: Alternatively could consider imposing specific script limits that make PAIRCOMMIT explicitly less capable than MAST itself.

moonsettler avatar Nov 26 '24 20:11 moonsettler

I think I've changed my mind a bit. We were talking about computing a merkle tree for f(u32,u32) as if it was trivial but after a quick experiment it seems like that would take hundreds of years to compute (am I being dumb here?) Instead, you can compute mul(u32,u32) -> u32 using 3 mul(u16,u16)s which is feasible to compute. The witness size is worse, ~32 * 32 * 3 = 3072 instead of 32 * 64 * 1 = 2048, but computing the tree for mul(u16,u16) is feasible using a naive algorithm on commodity hardware.

The implication of this is that where a function can be decomposed into operations on smaller inputs, PAIRCOMMIT is massively more feasible to use than encoding things into a tap tree.

Ademan avatar Nov 27 '24 05:11 Ademan

I think I've changed my mind a bit. We were talking about computing a merkle tree for f(u32,u32) as if it was trivial but after a quick experiment it seems like that would take hundreds of years to compute (am I being dumb here?) Instead, you can compute mul(u32,u32) -> u32 using 3 mul(u16,u16)s which is feasible to compute. The witness size is worse, ~32 * 32 * 3 = 3072 instead of 32 * 64 * 1 = 2048, but computing the tree for mul(u16,u16) is feasible using a naive algorithm on commodity hardware.

Arithmetic and bitwise operations where inputs & outputs are small enough, can already be done in Script in cheaper ways. Merkle trees as lookup tables are only interesting for functions that are either extremely complex, or where preimages/images are larger than what Script can work with. Note that you can already do small indexed lookup tables more efficiently by just hard-coding them in Script (that is: push the table on the stack and use OP_PICK to read its entries), and these techniques are widely used (e.g. in BitVM).

The implication of this is that where a function can be decomposed into operations on smaller inputs, PAIRCOMMIT is massively more feasible to use than encoding things into a tap tree.

I think the only substantial difference is that in a Script where you need several lookups, you can do it with Merkle trees, while you can only do a single lookup with a precomputed taptree.

bigspider avatar Nov 27 '24 09:11 bigspider

Proving general computation

Merkle trees can be used to prove out computation where the root of the tree represents the function and the leaves represent the inputs and output. There are practical limits to the entropy space for the inputs as it needs to be iterated over and hashed up.

Currently MAST trees can cover 128 bits of entropy space, which is well over the practical limits to iterate over and merklize. Therefore we assume this capability does not materially extend what computations are possible to prove out in bitcoin script. While OP_PAIRCOMMIT is not limited to a height of 128, that should not be practically feasible to utilize.

There is a way to reduce the size of the witness for proving out computation, by eliminating the merkle path inclusion proofs, using OP_CHECKSIGFROMSTACK together with OP_PAIRCOMMIT. This method involves deleted key assumptions, most likely using MPC to create an enormous amount of signatures for the stack elements representing the inputs and the output of the function.

Is this correct? Any suggestions? @Ademan @bigspider

moonsettler avatar Nov 27 '24 14:11 moonsettler

The implication of this is that where a function can be decomposed into operations on smaller inputs, PAIRCOMMIT is massively more feasible to use than encoding things into a tap tree.

This is the main open question I believe. does it or does it not practically expand what we can already do? For example using PC to emulate smolCAT and using traditional methods with lookup tables could make 32 bit or even 64 bit arithmetics more feasible?

edit: Within the 32 bit realm we can already use OP_ADD, I see little practical diff between <0x1234> <0x5678> CAT and <0x12340000> <0x5678> ADD. And it sounds like 64 bit smolCAT would be way too expensive to generate (and also to interact with trustlessly).

(actually the above examples are wrong, because internally bitcoin script uses little endian, but should convey the point)

moonsettler avatar Nov 27 '24 14:11 moonsettler

...

Arithmetic and bitwise operations where inputs & outputs are small enough, can already be done in Script in cheaper ways. Merkle trees as lookup tables are only interesting for functions that are either extremely complex, or where preimages/images are larger than what Script can work with. Note that you can already do small indexed lookup tables more efficiently by just hard-coding them in Script (that is: push the table on the stack and use OP_PICK to read its entries), and these techniques are widely used (e.g. in BitVM).

Even u16,u16 is quite a bit larger than I think is practical as a lookup table, but the efficiency for repeated operations is constant, obviously. The lookup table is less efficient for small numbers of operations (a u8,u8 table is 16k vs 1 u8,u8 proof is 0.4k) but the merkle tree loses quickly when those operations are repeated.

The implication of this is that where a function can be decomposed into operations on smaller inputs, PAIRCOMMIT is massively more feasible to use than encoding things into a tap tree.

I think the only substantial difference is that in a Script where you need several lookups, you can do it with Merkle trees, while you can only do a single lookup with a precomputed taptree.

Right, and the key point is these merkle trees and lookup tables rapidly become infeasible to compute as the input size grows, so multiple smaller lookups is significantly more useful.

EDIT: But your point is well taken that for smaller operations they can already be better accomplished by lookup tables.

Ademan avatar Nov 27 '24 15:11 Ademan

... edit: Within the 32 bit realm we can already use OP_ADD, I see little practical diff between <0x1234> <0x5678> CAT and <0x12340000> <0x5678> ADD. And it sounds like 64 bit smolCAT would be way too expensive to generate (and also to interact with trustlessly).

(actually the above examples are wrong, because internally bitcoin script uses little endian, but should convey the point)

Yeah for arbitrary 8 byte strings smolCAT seems infeasible to compute the table or merkle tree for. After a bit of conversation on IRC it could probably be feasible for arbitrary f(b[4],b[4]) -> b[8] with a custom ASIC¹ or maybe a cluster of FPGAs in a span of ~a few years but that would not be very useful for the average person.

Bit shifts over 32 bit integers seems pretty feasible though, that's f(u32,u6)->u32 (maybe save some space by special casing shift = 0). it seems like my incredibly naive, unoptimized, single-core experiment could calculate that merkle tree in ~96 hours. Of course the proof is ~1.2k and users would likely need multiple, but the lookup table for that wouldn't fit in a block anyway so maybe something new is possible?

You can also separate positive and negative shifts, and maybe break it down into multiple rounds of shifts 1-3 or something (or 1k for a proof for a constant shift)

[1]: afaik existing ASICs operate on block headers so couldn't help

Ademan avatar Nov 27 '24 17:11 Ademan

I think this BIP is already way more verbose than it was supposed to be.

It would be useful if we could reference it by a number. Can we get a BIP number assigned? Not asking for a merge yet.

moonsettler avatar Nov 27 '24 21:11 moonsettler

Should we add this table to this BIP? And it's not just vBytes but also the number of sigops to consider, which is a cost all nodes on the p2p network have to bear.

edit: I think it looks like this:

Method ChannelS UpdateSc UpdateWi 1-Update 2-Update
APO-annex 2.25 vB 28.25 vB 25 vB 305 vB 461.5 vB
APO-return 2.25 vB 28.25 vB 16.5 vB 338.5 vB 528.5 vB
CTV+CSFS+IKEY 2.75 vB 12.25 vB 24.5 vB 331 vB 513 vB
CTV+CSFS 11 vB 20.5 vB 24.5 vB 347.5 vB 537.75 vB
LNhance 3 vB 12.5 vB 32.75 vB 297.75 vB 446.25 vB
CTV+CSFS+VAULT 18.75 vB 30 vB 35 vB 333.75 vB 502.25 vB
rekey 7.25 vB 16.75 vB 73.75 vB 347.25 vB 541 vB
Method ForceC Update Settle OP_RETURN
APO-annex 1 SigOp 1 SigOp 1 SigOp
APO-return 1 SigOp 1 SigOp 1 SigOp X
CTV+CSFS+IKEY 1 SigOp 1 SigOp CTV X
CTV+CSFS 1 SigOp 1 SigOp CTV X
LNhance 1 SigOp 1 SigOp CTV
CTV+CSFS+VAULT 2* SigOp 2* SigOp CTV
rekey 3 SigOp 3 SigOp CTV

* VAULT is not exactly a SigOp, but close enough. Has a budget cost of 1.2 SigOps.

moonsettler avatar Dec 05 '24 15:12 moonsettler

tbh i have no idea how to read that table, so might be good to have clearer labeling somehow / break down where the accounting came from?

JeremyRubin avatar Dec 06 '24 04:12 JeremyRubin

This is based on @reardencode's spreadsheet: https://docs.google.com/spreadsheets/d/1UNW1AV7F8Srf-vHL5F3bKnkezbHIMabQqhU21tV915o/edit?gid=0#gid=0

moonsettler avatar Dec 06 '24 09:12 moonsettler

It would be useful if we could reference it by a number. Can we get a BIP number assigned? Not asking for a merge yet.

Looking at the Motivation section, it seems to me that the main application for this proposal would be a construction that depends on three other undeployed proposals some of which are themselves draft stage or pre-draft. This proposal feels a bit hypothetical at this point. I’ll get back to you next week.

murchandamus avatar Dec 06 '24 19:12 murchandamus

This proposal feels a bit hypothetical at this point.

While PAIRCOMMIT would only be truly useful with certain other future upgrades, it is proposed to activate in a bundle with said updates. We are in the process of trying to reach consensus on said package.

It's unlikely I would withdraw OP_PAIRCOMMIT, unless OP_CAT got activated first.

moonsettler avatar Dec 06 '24 19:12 moonsettler

What date should go to the header?

The Created header records the date that the BIP was assigned a number (see BIP-2).

jonatack avatar Dec 07 '24 16:12 jonatack

I get that there is a goal here to avoid introspection...

but it seems that it'd be more generically useful if the function were e.g., a TapBranch function, so then it could be used in the future with some other taproot editing opcodes.

e.g., if OP_TAPBRANCHCOMMIT were to lexicographic sort, then cat, then commit, you could do paircommit like functionality by doing:

<a> <b> TUCK OP_TAPBRANCHCOMMIT OP_TAPBRANCHCOMMIT

this works because while the first commit commits to either a||b or b||a, and also has sliceable errors (e.g., (a||b)[:i] and (a||b)[i:]), the second commit re-commits to <b> and it's length which uniquely determines order and prevents sliceable errors.

One concern: OP_TAPBRANCHCOMMIT is witness "malleable", in that items could show up in the witness stack in either order and get the same result. It'd still be possible to make non-malleable witnesses by requiring the stack elements to be in order with a OP_LEXSORT or equivalent functionality.

JeremyRubin avatar Dec 24 '24 17:12 JeremyRubin

If we did OP_TAPBRANCHCOMMIT then with OP_CHECKCONTRACTVERIFY I believe you get TLUV behavior? You can verify a tapleaf on the input and can also replace it with an other on the output?

To be honest the OP_PAIRCOMMIT domain separation was aimed to prevent the possibility of such uses as potentially being more controversial.

moonsettler avatar Dec 24 '24 20:12 moonsettler

I think it's actually less controversial, because if you do OP_TAPBRANCHCOMMIT you're doing something that even once you have OP_CAT, is really handy to have (because sorting two strings is actually still pretty hard with CAT).

Whereas now you're getting a lot of people thinking that paircommit is not so useful if CAT gets in eventually.

JeremyRubin avatar Dec 25 '24 00:12 JeremyRubin

Right. I did consider something like a sorting merkle operator for lamport stuff for example, however when you need order dependent commitments (which is pretty much everything we are doing with it rn) then you would have to use a dummy value like: <a> <1> <b> OP_TAPBRANCHCOMMIT OP_TAPBRANCHCOMMIT or maybe: <a> <b> OP_SHA256 OP_TAPBRANCHCOMMIT instead of: <a> <b> OP_PAIRCOMMIT So if I wanted to support both of these functionalities I would probably end up with 2 separate opcodes for them.

moonsettler avatar Dec 25 '24 01:12 moonsettler

I think this is ready for a merge from my point of view. Rebased to latest master and squashed all the changes to 1 commit.

moonsettler avatar Dec 28 '24 17:12 moonsettler

Maybe this is obvious to everyone else and I'm a slowpoke here, but PC+CSFS+CTV enables a kind of "multi-transaction-signature". You sign a merkle root that commits to a bunch of CTV transaction templates, and you can provide an inclusion proof with PC with the signature.

I've been spitballing with someone about a multiparty eltoo scheme and came up with a way to bound the state resolution but without state carrying covenants it requires a possible field of ~2^N transactions (of which, at most N will end up on-chain), and every party shares N*(N+1)/2 signatures.

With PC+CSFS+CTV it still requires ~2^N transactions, but each party only needs to share their 1 signature over the merkle root.

EDIT: Not 100% sure it works for my mutliparty eltoo scheme just yet, but I'm still >50% confident it does... EDIT EDIT: More confident it works, but it looks like the number of transactions is exponential rather than polynomial... and so my estimate for how many signatures would need to be shared without the "multi-transaction-signature" commitment is probably way low. EDIT EDIT EDIT: I genuinely have no idea how I decided "multi-signature" was an acceptable term considering it already has a different meaning in Bitcoin... s/"multi-signature"/"multi-transaction-signature"/g

Ademan avatar Dec 28 '24 18:12 Ademan

Thanks so much Jon. we have a significant revision coming that does include defining abbreviations. @moonsettler should we move this to draft until we finish workshopping the revisions we're working on?

reardencode avatar Jan 04 '25 04:01 reardencode

We have a rework of the BIP, the original can be viewed here

Please provide feedback on which approach you think is better! @murchandamus @jonatack

moonsettler avatar Feb 08 '25 18:02 moonsettler

Hey, I see that there was a force push, but it is not clear to me whether the review comments have been addressed. Looking at a couple at least some of them seem to still apply. Could you please process the review comments, either reply on them if there is information to share or the conversation is still on-going, or resolve them if you have applied them or rejected them? Please mark this pull request as "Ready for review" when you would like another editor review.

murchandamus avatar Mar 21 '25 16:03 murchandamus

Hey, I see that there was a force push, but it is not clear to me whether the review comments have been addressed. Looking at a couple at least some of them seem to still apply. Could you please process the review comments, either reply on them if there is information to share or the conversation is still on-going, or resolve them if you have applied them or rejected them? Please mark this pull request as "Ready for review" when you would like another editor review.

Sorry, I don't think we addressed everything. Should go through every suggestion one by one. Right now I can't prioritize this sadly. Not sure if @reardencode wishes to pick up the slack in the meantime.

moonsettler avatar Apr 28 '25 22:04 moonsettler