act Unrolling loops with helper functions

When specifying the behaviour of functions that involves a loop, it is sometimes hard to express the post-state on a closed form. A real life example is Pot.rpow, which calculates exponentials of fixed-point integers, or the deposit function of the Eth2.0 registration contract.

Here I propose a syntax which introduces a where construct that I think could handle many such cases. But first, I would like to express the shortcomings of the proposal

In the general case, the result of a loop is difficult to express in a functional language. Especially if it updates storage in unstructured ways. For example, something like

for (i = 0; i < N; i++) {
    assembly {
       sstore(i,i)
   }
}

seems hard to express elegantly in our current setting. However, such contracts are arguably pretty strange anyway.

But in most use cases, when the write-access of a loops aren't determined dynamically, the behaviour of the loop can be expressed as with the help of a recursive, pure function. To allow for such helper functions, I propose to add a new header, where, where they can be defined.

For example, the semantics of:

uint[] zeroHashes;
function zerohashN(uint256 n) { 
bytes32 res = "0x0";
for (i = 0; i <= n; i++;) {
    res += keccak256(res);
}
zeroHashes[n] = s;

is captured by:

behaviour zeroHashN of Arithmetic
interface zeroHashN(uint256 n)

storage

   zeroHashes[n] |-> _ => nth_hash_of(0, n)

where

   nth_hash_of(h, n) = if n == 0 then h else hash(nth_hash_of(h, n - 1))

Nov 13 '19 21:11 MrChico

This example came up as I was writing https://github.com/ethereum/smart-contract-spec-lang/blob/bnfc/examples/deposit/deposit-spec.md#deposit

Nov 13 '19 21:11 MrChico

Huge bikeshedding opportunity on using => or = in the function definitions for the where header. VOTE NOW

Nov 13 '19 22:11 MrChico

Is the = operator in the where meaningfully semantically different than the => operator in the above storage statement? cracks open the door to the bikeshed

Nov 16 '19 05:11 djrtwo

@djrtwo they are similar in the sense that the semantics of both is that the LHS rewrites to the RHS, but their difference is akin to that of big step vs. small step semantics: the RHS of a => would be the post state after the entire function call has been processed, whereas the RHS of a where function represents an intermediate state of a calculation necessary to be able to express the former.

They would also be treated somewhat differently by proving engines. I imagine the helper functions defined in the where header would need to be translated into the native language of the prover as independent functions, so that the full step transitions could depend on them

Nov 16 '19 17:11 MrChico

@MrChico how about

behaviour zeroHashN of Arithmetic
interface zeroHashN(uint256 n)

storage

   zeroHashes[n] |-> _ => h | nth_hash_of(0, n, h)

where

   nth_hash_of(h_in, 0, 0)
   nth_hash_of(h_in, n, h_out) <= nth_hash_of(h_in, n - 1, h), hash(h, h_out)

Nov 21 '19 18:11 leonardoalt

I guess it's just a matter of function vs predicate ;)

Nov 21 '19 18:11 leonardoalt

But I do find predicates more general for these. You could write a loop invariant predicate that (to me) would be more expressive than a function, and you could still update storage with it.

Nov 21 '19 18:11 leonardoalt

@MrChico how about

behaviour zeroHashN of Arithmetic
interface zeroHashN(uint256 n)

storage

   zeroHashes[n] |-> _ => h | nth_hash_of(0, 0, h)

where

   nth_hash_of(h_in, 0, 0)
   nth_hash_of(h_in, n, h_out) <= nth_hash_of(h_in, n - 1, h), hash(h, h_out)

Sorry, I don't understand what this means

Nov 21 '19 20:11 MrChico

You can read it as:

nth_hash_of h_in for n = 0 is 0
nth_hash_of h_in for any n is h_out if nth_hash_of h_in n - 1 times is h, and hash of h is h_out.

Nov 22 '19 11:11 leonardoalt

Oh, I see. But where do you define these predicates? Are they left uninterpeted or is there some logic programming conventions here that I'm unaware of?

Also, why are they more general? It seems you're just asserting a relation between a functions input and output...

Nov 22 '19 15:11 MrChico

This is the interpretation of the predicate, as long as hash is also defined. Note that the second case does cover all generic inputs, and the base case is sufficient to guarantee termination, so there's no uninterpreted scenario.

I take that back. I don't think it's more general. I just think the syntax is easier to apply (very biased here I'm sure), in the sense that you can capture inductive invariants with a single predicate inv(x, y, z) which is defined later (in where), without having to assign to a storage slot/variable.

Nov 22 '19 16:11 leonardoalt

rpow example:

    function rpow(uint x, uint n, uint b) public pure returns (uint z) {
      assembly {
        switch x case 0 {switch n case 0 {z := b} default {z := 0}}
        default {
          switch mod(n, 2) case 0 { z := b } default { z := x }
          let half := div(b, 2)  // for rounding.
          for { n := div(n, 2) } n { n := div(n,2) } {
            let xx := mul(x, x)
            if iszero(eq(div(xx, x), x)) { revert(0,0) }
            let xxRound := add(xx, half)
            if lt(xxRound, xx) { revert(0,0) }
            x := div(xxRound, b)
            if mod(n,2) {
              let zx := mul(z, x)
              if and(iszero(iszero(x)), iszero(eq(div(zx, x), z))) { revert(0,0) }
              let zxRound := add(zx, half)
              if lt(zxRound, zx) { revert(0,0) }
              z := div(zxRound, b)
            }
          }
        }
      }
    }

behaviour rpow of Jug
interface rpow(uint x, uint n, uint b)

returns rpow(x, n, b, 0)

where 
      rpow(x, 0, b, acc) = b
      rpow(x, 1, b, acc) = acc
      rpow(x, n, b, acc) = rpow(x, n - 1, b, ((acc * x) + b / 2) / b)

Nov 25 '19 15:11 MrChico

After some more discussions, here's some key factors that lead us to tentatively agree on the "functional" format:

It's probably easier to understand for most of our target audience, assuming that one doesn't go too deep into the swamp of higher abstractions.
While the logical format can ease the proving of such claims, we want to keep specifications independent of: a) The implementation details of the contracts and b) The methods used to prove the specifications

Both of these properties should aid in the readability of specifications allowing readers to know what the contract does without distracting them with the details of how it is done or how the proof that demonstrates that it actually does it is conducted.

Furthermore, when proving "higher level properties" about the contract (i.e. properties over the set of atomic state transitions (i.e. "big step" properties)), neither a) and b) are relevant, and is disregarded anyway.

We still need to have ways of expressing invariants or lemmas in a comprehensible, back-end independent manner, but the point here is that this shouldn't be in scope for behaviour specs, but rather something that belongs here

Nov 25 '19 17:11 MrChico