BIP for reenabling OP_CAT
This BIP defines OP_CAT, a new tapscript opcode that allows the concatenation of two values on the stack. This opcode would be activated via a soft fork by redefining the opcode OP_SUCCESS126.
See our implementation PR in bitcoin-inquisition: https://github.com/bitcoin-inquisition/bitcoin/pull/39
Definitely looking forward to test-driving this BIP.
Can we get a BIP number assigned? Any blockers to doing this?
@kallewoof The BIP file is still named bip-???-cat.mediawiki. How do we go about numbering this BIP?
@kallewoof The BIP file is still named bip-???-cat.mediawiki. How do we go about numbering this BIP?

Once a BIP number has been assigned, the authors are expected to fill it in in the right places.
Thanks for the reply. Of course, we'll change and fill in the BIP number wherever it's needed. Has this BIP already been assigned a number? I might have missed that comment.
Missing a section for backward compatibility
@luke-jr Added!
@luke-jr We are requesting a BIP number for this PR
Any updates on this? I would love to see this BIP merged to make verification of FRI-based zero-knowledge proofs (pay-to-ZKP) more efficient.
I would like to propose to replace the per-item size limit with a total stack size limit that is equivalent and can be introduced as a softfork as well. Since we currently allow 1000 items of 520 bytes, I would propose a 520kB (520,000 bytes) size limit for the entire stack + altstack.
This can be implemented as such:
```diff
diff --git a/src/script/interpreter.cpp b/src/script/interpreter.cpp
index 1583b5fadd3..cbe1695bab2 100644
--- a/src/script/interpreter.cpp
+++ b/src/script/interpreter.cpp
@@ -545,8 +545,6 @@ bool EvalScript(std::vector<std::vector<unsigned char> >& stack, const CScript&
                     return set_error(serror, SCRIPT_ERR_INVALID_STACK_OPERATION);
                 valtype& vch1 = stacktop(-2);
                 valtype& vch2 = stacktop(-1);
-                if (vch1.size() + vch2.size() > MAX_SCRIPT_ELEMENT_SIZE)
-                    return set_error(serror, SCRIPT_ERR_INVALID_STACK_OPERATION);
                 vch1.insert(vch1.end(), vch2.begin(), vch2.end());
                 stack.pop_back();
             }
@@ -1287,8 +1285,17 @@ bool EvalScript(std::vector<std::vector<unsigned char> >& stack, const CScript&
         }

         // Size limits
+        // limit number of elements in stacks
         if (stack.size() + altstack.size() > MAX_STACK_SIZE)
             return set_error(serror, SCRIPT_ERR_STACK_SIZE);
+        // limit total number of bytes in stacks
+        unsigned int total_bytes = 0;
+        for (size_t i = 0; i < stack.size(); i++)
+            total_bytes += stack[i].size();
+        for (size_t i = 0; i < altstack.size(); i++)
+            total_bytes += altstack[i].size();
+        if (total_bytes > MAX_STACK_BYTES)
+            return set_error(serror, SCRIPT_ERR_STACK_SIZE);
     }
 }
 catch (...)
diff --git a/src/script/script.h b/src/script/script.h
index 305919f5ba4..ef40c3672ee 100644
--- a/src/script/script.h
+++ b/src/script/script.h
@@ -38,6 +38,9 @@ static const int MAX_SCRIPT_SIZE = 10000;
 // Maximum number of values on script interpreter stack
 static const int MAX_STACK_SIZE = 1000;
+// Maximum number of total bytes on script interpreter stack
+static const int MAX_STACK_BYTES = 520000; // 520 * 1000
+
 // Threshold for nLockTime: below this value it is interpreted as block number,
 // otherwise as UNIX timestamp.
 static const unsigned int LOCKTIME_THRESHOLD = 500000000; // Tue Nov 5 00:53:20 1985 UTC
```
NACK
OP_CAT enables recursion, which is computationally equivalent to iteration, thus making Script Turing-complete.
@ChrisMartl How does OP_CAT add recursion? What's your source for this claim?
In case this claim originates with something I've said publicly: I am confident that CAT does not introduce recursion to Script and I have never (intentionally) said that it did.
CAT does introduce a form of recursive covenants. But the form of "recursion" here is one that involves a fresh transaction for every iteration, and does not change the computational model at all.
Re: stevenroose's proposal: https://github.com/bitcoin/bips/pull/1525#issuecomment-1979297869
I would like to propose to replace the per-item size limit with a total stack size limit that is equivalent and can be introduced as a softfork as well. Since we currently allow 1000 items of 520 bytes, I would propose a 520kB (520,000 bytes) size limit for the entire stack + altstack.
This seems to require checking the size limit after every opcode is computed, which might not scale. I think this may need to be left to a different PR. Calculating the delta and updating only the delta is possible, but it would require substantial changes to the implementation.
Consider the following script:
```
"1234567" DUP CAT 8 CAT
DUP CAT DUP CAT DUP CAT DUP CAT DUP CAT DUP CAT DUP CAT
DUP CAT DUP CAT DUP CAT DUP CAT DUP CAT DUP CAT DUP CAT
1 SHA256 OVER CAT SHA256
[ OVER CAT SHA256 ] * 130000
CAT SHA256
```
It constructs a ~246kB string on the stack consisting of many repeats of "123456712345678", then repeatedly concatenates that string to a SHA256 hash, using the hash of the result as the new SHA256 hash. Doing this 130k times means you're hashing about 32GB of data despite never having more than about 492kB on the stack in total, while still fitting in a standard tx with less than 400k weight units. For comparison, constraining each stack element to 520 bytes limits the total amount hashed to under 208MB for any standard transaction.
Assigned BIP 347.
Does this mean we can also formally propose OP_{LEFT,RIGHT,SUBSTR} now too? :heart_eyes:
@delbonis anybody can propose anything formally, by running through the process of creating a proposal, soliciting input on the mailing list, etc.
But the three opcodes you mention can be fairly cheaply emulated using CAT, and have fewer direct use cases, so it may not be a productive direction to pursue.
LEFT/RIGHT/SUBSTR/SPLIT etc. are actually a bit nicer than CAT in some ways, because they only create same-size or smaller elements on the stack, avoiding the concerns around the maximum element size and working seamlessly if the element size is ever increased.
Please keep comments focused on technical review of the proposal -- thanks.
@delbonis @JeremyRubin Re: OP_LEFT/RIGHT/SUBSTR
There is still value in OP_LEFT/RIGHT/SUBSTR. To emulate them with OP_CAT, one needs to provide a hint, which takes space in the witness stack / input stack. This is inconvenient because the witness stack can have at most 1000 elements.
In our practice with Merkle trees for a STARK proof verifier that uses OP_CAT, we need to provide a sibling element for each layer. A standard transaction effectively restricts us to a binary Merkle tree, and therefore there are log_2 n elements that we need to push onto the witness stack. (A nonstandard transaction is an easier situation, which allows log_16 n.) OP_LEFT/RIGHT/SUBSTR, together with a nonstandard transaction, would allow one to pack elements compactly to save stack space.
Here is the related codebase: https://github.com/Bitcoin-Wildlife-Sanctuary/bitcoin-circle-stark?tab=readme-ov-file#performance
But I think people can agree that bringing back OP_SUBSTR would be enough. This would let us save the other OP_SUCCESS opcodes for other purposes.
That said, this is not related to the OP_CAT BIP discussion, and may be better discussed in other venues.
Could you point me toward where I can learn more about the stack size limit referred to? I'm not technically savvy enough to see how the 520-byte stack element limit is imposed on the OP_CAT operation when there is no mention of the rationale in the commit. Is it enforced by tapscript? @EthanHeilman thank you
Thanks for your question!
Could you point me toward where i can learn more about the stack size limit referred? @OminousPeace
In the OP_CAT reference implementation in BIP-347:
```cpp
case OP_CAT:
{
    if (stack.size() < 2)
        return set_error(serror, SCRIPT_ERR_INVALID_STACK_OPERATION);
    valtype& vch1 = stacktop(-2);
    valtype& vch2 = stacktop(-1);
    if (vch1.size() + vch2.size() > MAX_SCRIPT_ELEMENT_SIZE)
        return set_error(serror, SCRIPT_ERR_PUSH_SIZE);
    vch1.insert(vch1.end(), vch2.begin(), vch2.end());
    stack.pop_back();
}
break;
```
We check that there are sufficient elements on the stack:
```cpp
if (stack.size() < 2)
    return set_error(serror, SCRIPT_ERR_INVALID_STACK_OPERATION);
```
Then we check that the CAT operation will not create a stack element larger than the maximum stack element size:
```cpp
if (vch1.size() + vch2.size() > MAX_SCRIPT_ELEMENT_SIZE)
    return set_error(serror, SCRIPT_ERR_PUSH_SIZE);
```
MAX_SCRIPT_ELEMENT_SIZE is defined as 520 (bytes) in Bitcoin Core:
```cpp
// Maximum number of bytes pushable to the stack
static const unsigned int MAX_SCRIPT_ELEMENT_SIZE = 520;
```
OP_CAT is neutral with regard to the total size of the stack because it joins two elements that were already on the stack. That is, an OP_CAT operation should never increase or decrease the total size of the stack in bytes.
For running code see our implementation of OP_CAT in bitcoin-inquisition.
Is it enforced by tapscript?
Both script and tapscript enforce the stack element size limit and the total stack size limit when an OP_PUSH occurs. OP_CAT, like any other opcode that pushes data onto the stack, must also ensure that it does not violate these limits, which is why it returns an error if it would push an element greater than the maximum stack element size (520 bytes).
Rationale
We do provide a rationale for why we don't alter bitcoin consensus to increase the stack element size limit in the rationale section of the BIP.
Did that answer your question?
I put the BIP number in the title, I hope you don’t mind. It seems to me that there are still a couple open review comments, will swing by again later to check whether it’s ready for the next round of review.
@luke-jr had previously requested changes:
Missing a section for backward compatibility
which I consider to have been addressed. I will abstain from merging this PR for some time, to give other editors a chance to make their own assessment.
It seems useful to use the ACM DOI URI for that paper mentioned above with the nonfunctional link: https://dl.acm.org/doi/10.1145/2810103.2813686
It is very likely to still work for ten more years. Although it is paywalled, people can find free versions online.
The build check was failing, because the BIP had not been added to the table in the README yet, so I added a commit to add it to the table.
It seems useful to use the ACM DOI URI for that paper mentioned above with the nonfunctional link: dl.acm.org/doi/10.1145/2810103.2813686
It is very likely to still work for ten more years. Although it is paywalled, people can find free versions online.
Yep, sorry, we weren't allowed to upload it to eprint.iacr.org or arXiv due to copyright reasons. (The ACM is evil...)
But it's lawfully available under https://publications.cispa.saarland/565/1/penalizing.pdf (Web Archive: https://web.archive.org/web/20221023121048/https://publications.cispa.saarland/565/1/penalizing.pdf), and I've also just made it available under https://raw.githubusercontent.com/real-or-random/accas/master/paper.pdf (Web Archive: https://web.archive.org/web/20240506104315/https://raw.githubusercontent.com/real-or-random/accas/master/paper.pdf).
@real-or-random Just added a stable URL for Liar, Liar, Coins on Fire! citation.