sigma-rust
sigma-rust copied to clipboard
Unsafe unwrap in constant substitution
I don't think calling unwrap is safe here. It's quite easy to create ergo tree that references nonexistent constant
https://github.com/ergoplatform/sigma-rust/blob/3e09084d7acbe1cabd7424e5c31c69a1bf09ffa1/ergotree-ir/src/serialization/expr.rs#L92-L96
Also why constant substitution is even needed? It breaks roundtrip property. If constants are substituted it's no longer possible to serialize script back.
I don't think calling unwrap is safe here. It's quite easy to create ergo tree that references nonexistent constant
https://github.com/ergoplatform/sigma-rust/blob/3e09084d7acbe1cabd7424e5c31c69a1bf09ffa1/ergotree-ir/src/serialization/expr.rs#L92-L96
Great catch! I wonder, why clippy missed it although we have #![deny(clippy::unwrap_used)]
in lib.rs
. You're right, my comment there is overly optimistic (it assumes no malicious intentions).
Also why constant substitution is even needed? It breaks roundtrip property. If constants are substituted it's no longer possible to serialize the script back.
We can have the interpreter evaluate a tree with constant placeholders giving that we provide constants for them separately. However, we also need the existing interpreter mode with constant values, since constant segregation is an additional step.
I believe the were multiple reasons for constant segregation:
- To have a way of changing the contract parameters. Like if you know a PK has index 1, you can change it. See
ErgoTree::with_constant
. - Speed up AOT costing. If template bytes (a serialized tree with constant placeholders) are the same, no need to check it again.
- Template bytes to have a role of a reference/id to the contract (like particular DEX orders will all have the same template bytes).
I had rather different question in mind. Not why ConstantPlaceholder
in useful but rather whether is it good idea to replace ConstantPlaceholder
with Constant
. In that case expression could be serialized back (result will be different). Also AFAIR evaluator can't handle ConstantPlaceholder
so currently if one wants faithful serialization he couldn't have evaluation and vice versa. All in all it makes a great footgun: accidentally replace constants and serialize script back
Maybe it's better to drop substitute_placeholders
switch altogether? Few other places will require ConstantPlaceholder
as parameter
I had rather different question in mind. Not why
ConstantPlaceholder
in useful but rather whether is it good idea to replaceConstantPlaceholder
withConstant
. In that case expression could be serialized back (result will be different). Also AFAIR evaluator can't handleConstantPlaceholder
so currently if one wants faithful serialization he couldn't have evaluation and vice versa. All in all it makes a great footgun: accidentally replace constants and serialize script back Maybe it's better to dropsubstitute_placeholders
switch altogether? Few other places will requireConstantPlaceholder
as parameter
I'm not sure I get the "why an expression could not be serialized back" thought. ErgoTree header flag decides if constant should be segregated during the serialization. See https://github.com/ergoplatform/sigma-rust/blob/f0b613f8307195f044092b7cc5fe30650a286970/ergotree-ir/src/serialization/expr.rs#L191-L198 where the constants get segregated. So when an expression has placeholders substituted on parsing it will undergo the opposite procedure(constant segregation) on serialization.
Such transformations always carry risk of failng to roundtrip correctly. Especially for corner cases. Few examples off top of my head:
- Serialization code doesn't try to test if constant is already in store. So it will fail to roundtrip any script which has two identical
ConstantPlaceholder
2.Any script that mixesConstant
andContantPlaceholder
will fail to roundtrip.
I suspect ergoscript compiler won't produce such scripts but it's entirely possible to write them by hand. 1) could happen as way of script size optimization and 2) as a way to keep values which should be constants anyway in the body of script
Such transformations always carry risk of failng to roundtrip correctly. Especially for corner cases. Few examples off top of my head:
- Serialization code doesn't try to test if constant is already in store. So it will fail to roundtrip any script which has two identical
ConstantPlaceholder
2.Any script that mixesConstant
andContantPlaceholder
will fail to roundtrip.I suspect ergoscript compiler won't produce such scripts but it's entirely possible to write them by hand. 1) could happen as way of script size optimization and 2) as a way to keep values which should be constants anyway in the body of script
Good points! We can safely dim them as invalid trees and act accordingly. I mean ErgoTree flag for constant segregation is binary so if both constant and placeholders are in the tree it's invalid. ConstantPlaceholder
is not introduced on the compiler level, but later on the serialization stage. The Expr
serialization should not be considered independently, only as a part of the ErgoTree. I'd rather deal with placeholder substitution on the parsing stage than later in the interpreter.