sigma-rust icon indicating copy to clipboard operation
sigma-rust copied to clipboard

Unsafe unwrap in constant substitution

Open Shimuuar opened this issue 3 years ago • 5 comments

I don't think calling unwrap is safe here. It's quite easy to create ergo tree that references nonexistent constant

https://github.com/ergoplatform/sigma-rust/blob/3e09084d7acbe1cabd7424e5c31c69a1bf09ffa1/ergotree-ir/src/serialization/expr.rs#L92-L96

Also why constant substitution is even needed? It breaks roundtrip property. If constants are substituted it's no longer possible to serialize script back.

Shimuuar avatar Jun 29 '21 09:06 Shimuuar

I don't think calling unwrap is safe here. It's quite easy to create ergo tree that references nonexistent constant

https://github.com/ergoplatform/sigma-rust/blob/3e09084d7acbe1cabd7424e5c31c69a1bf09ffa1/ergotree-ir/src/serialization/expr.rs#L92-L96

Great catch! I wonder, why clippy missed it although we have #![deny(clippy::unwrap_used)] in lib.rs. You're right, my comment there is overly optimistic (it assumes no malicious intentions).

Also why constant substitution is even needed? It breaks roundtrip property. If constants are substituted it's no longer possible to serialize the script back.

We can have the interpreter evaluate a tree with constant placeholders giving that we provide constants for them separately. However, we also need the existing interpreter mode with constant values, since constant segregation is an additional step.

I believe the were multiple reasons for constant segregation:

  • To have a way of changing the contract parameters. Like if you know a PK has index 1, you can change it. See ErgoTree::with_constant.
  • Speed up AOT costing. If template bytes (a serialized tree with constant placeholders) are the same, no need to check it again.
  • Template bytes to have a role of a reference/id to the contract (like particular DEX orders will all have the same template bytes).

greenhat avatar Jun 29 '21 10:06 greenhat

I had rather different question in mind. Not why ConstantPlaceholder in useful but rather whether is it good idea to replace ConstantPlaceholder with Constant. In that case expression could be serialized back (result will be different). Also AFAIR evaluator can't handle ConstantPlaceholder so currently if one wants faithful serialization he couldn't have evaluation and vice versa. All in all it makes a great footgun: accidentally replace constants and serialize script back

Maybe it's better to drop substitute_placeholders switch altogether? Few other places will require ConstantPlaceholderas parameter

Shimuuar avatar Jun 29 '21 16:06 Shimuuar

I had rather different question in mind. Not why ConstantPlaceholder in useful but rather whether is it good idea to replace ConstantPlaceholder with Constant. In that case expression could be serialized back (result will be different). Also AFAIR evaluator can't handle ConstantPlaceholder so currently if one wants faithful serialization he couldn't have evaluation and vice versa. All in all it makes a great footgun: accidentally replace constants and serialize script back Maybe it's better to drop substitute_placeholders switch altogether? Few other places will require ConstantPlaceholderas parameter

I'm not sure I get the "why an expression could not be serialized back" thought. ErgoTree header flag decides if constant should be segregated during the serialization. See https://github.com/ergoplatform/sigma-rust/blob/f0b613f8307195f044092b7cc5fe30650a286970/ergotree-ir/src/serialization/expr.rs#L191-L198 where the constants get segregated. So when an expression has placeholders substituted on parsing it will undergo the opposite procedure(constant segregation) on serialization.

greenhat avatar Jun 30 '21 07:06 greenhat

Such transformations always carry risk of failng to roundtrip correctly. Especially for corner cases. Few examples off top of my head:

  1. Serialization code doesn't try to test if constant is already in store. So it will fail to roundtrip any script which has two identical ConstantPlaceholder 2.Any script that mixes Constant and ContantPlaceholder will fail to roundtrip.

I suspect ergoscript compiler won't produce such scripts but it's entirely possible to write them by hand. 1) could happen as way of script size optimization and 2) as a way to keep values which should be constants anyway in the body of script

Shimuuar avatar Jun 30 '21 09:06 Shimuuar

Such transformations always carry risk of failng to roundtrip correctly. Especially for corner cases. Few examples off top of my head:

  1. Serialization code doesn't try to test if constant is already in store. So it will fail to roundtrip any script which has two identical ConstantPlaceholder 2.Any script that mixes Constant and ContantPlaceholder will fail to roundtrip.

I suspect ergoscript compiler won't produce such scripts but it's entirely possible to write them by hand. 1) could happen as way of script size optimization and 2) as a way to keep values which should be constants anyway in the body of script

Good points! We can safely dim them as invalid trees and act accordingly. I mean ErgoTree flag for constant segregation is binary so if both constant and placeholders are in the tree it's invalid. ConstantPlaceholder is not introduced on the compiler level, but later on the serialization stage. The Expr serialization should not be considered independently, only as a part of the ErgoTree. I'd rather deal with placeholder substitution on the parsing stage than later in the interpreter.

greenhat avatar Jun 30 '21 10:06 greenhat