solang Add constructor support to Solidity contracts

Soroban protocol 21 did not support constructors (functions exposed by the contract that the VM calls upon contract deployment). As a result, a developer had to call init manually from the command line.

init is a function that Solang embeds in Solidity contracts. It loops over the storage variables and initializes them by calling another function, storage_initializer: https://github.com/hyperledger-solang/solang/blob/5176166545df6090131565a990b303d9cfabf37e/src/emit/soroban/mod.rs#L254

What needs to be done: upgrade the protocol version in the contract to 22, rename init to __constructor, and add tests.

Nov 02 '24 11:11 salaheldinsoliman

change the name init to __constructor

Just to make sure I understand this correctly: if we just rename init to __constructor, it seems we still wouldn’t fully support constructors as outlined in CAP-0058. Developers would still need to call it manually, as they currently do with init.

There are a few things I’m unsure about:

Since __constructor is optional, what happens if the developer doesn’t implement it?
- Do we just expose an empty __constructor function that only calls storage_initializer, similar to how init works now, and expect the developer to call it manually?
I noticed that the new __constructor method can also include custom logic that the developer wants to run on initialization, which might mean it can take parameters too. Will Solang support that?
Should __constructor still be callable after the contract is deployed and initialized?

Nov 08 '24 17:11 tareknaser

@tareknaser __constructor should call storage_initializer before executing custom initialization logic. (not only call storage_initializer)

Nov 08 '24 21:11 salaheldinsoliman

Alright, here’s what I understand about the desired behavior:

If the developer doesn’t implement __constructor, we should export a function called __constructor that only initializes storage variables by calling storage_initializer. This part seems straightforward; we could just rename what we have from init to __constructor.
If the developer defines their own __constructor function with custom code, then we need to ensure that __constructor first calls storage_initializer and then executes the custom code. This is the part that’s proving tricky.
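The ordering described in these two cases can be sketched as a small model. This is only an illustration in plain Rust, not Solang's code generation: the `Storage` map, the `initial_values` slice, and the `constructor_entry` name are hypothetical stand-ins, and the developer's constructor body is modeled as an optional function pointer.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for contract storage: slot index -> value.
type Storage = BTreeMap<u32, u64>;

// Models storage_initializer: writes each storage variable's declared
// initial value into its slot.
fn storage_initializer(storage: &mut Storage, initial_values: &[(u32, u64)]) {
    for &(slot, value) in initial_values {
        storage.insert(slot, value);
    }
}

// Models the exported __constructor entry point (name assumed for the sketch):
// storage defaults are always written first, then the developer's constructor
// body runs, if the contract defines one.
fn constructor_entry(
    storage: &mut Storage,
    initial_values: &[(u32, u64)],
    user_body: Option<fn(&mut Storage)>,
) {
    storage_initializer(storage, initial_values);
    if let Some(body) = user_body {
        body(storage);
    }
}
```

With `user_body` set to `None`, the entry point behaves like the current init. With a body supplied, the body observes storage that has already been initialized, which is the property the second case asks for.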

I’ve tried a few approaches to handle the second point, but they haven’t worked out so far:

Modifying the developer’s __constructor function to first call storage_initializer. I couldn’t quite achieve this, but there might be a way with inkwell that I’m not yet aware of.
Overriding the developer’s __constructor implementation so it would call storage_initializer first, followed by the custom code and then re-export it. This also didn’t work out.

I’m continuing to debug, but I wanted to share these notes in case there’s a simpler approach that I might be missing. Do you have any suggestions?

Nov 12 '24 22:11 tareknaser

@tareknaser You are doing a great job! If you compile a simple contract for the Soroban target, one whose constructor just prints a value:

contract counter {
    uint64  sesa = 0;
    constructor() public {
        print("Constructor called");
    }
    function decrement(uint64 inp) public returns (uint64){
        inp = sesa; 
        return inp;
    }
}

and use wasm2wat to view the compiled contract, you get:

(module
  (type $t0 (func (param i64 i64 i64 i64) (result i64)))
  (type $t1 (func (param i64 i64) (result i64)))
  (type $t2 (func (param i64 i64 i64) (result i64)))
  (type $t3 (func (result i64)))
  (type $t4 (func (param i64) (result i64)))
  (import "x" "_" (func $x._ (type $t0))) ;; Print in Soroban
  (import "l" "1" (func $l.1 (type $t1))) 
  (import "l" "_" (func $l._ (type $t2))) ;; Storage set (initialize) in Soroban
  (func $__unnamed_1 (export "__unnamed_1") (type $t3) (result i64) ;; THIS IS THE CONSTRUCTOR
    (local $l0 i64)
    (drop
      (call $x._ ;; this is a call to print in Soroban
        (local.tee $l0
          (i64.or
            (i64.shl
              (i64.extend_i32_u
                (i32.const 1024))
              (i64.const 32))
            (i64.const 4)))
        (i64.const 77309411332)
        (local.get $l0)
        (i64.const 4)))
    (i64.const 2))
  (func $decrement (export "decrement") (type $t4) (param $p0 i64) (result i64)
    (local $l1 i64)
    (local.set $l1
      (call $l.1
        (i64.const 0)
        (i64.const 2)))
    (block $B0
      (br_if $B0
        (i32.const 0))
      (return
        (i64.or
          (i64.shl
            (local.get $l1)
            (i64.const 8))
          (i64.const 6))))
    (drop
      (call $x._
        (local.tee $l1
          (i64.or
            (i64.shl
              (i64.extend_i32_u
                (i32.const 1042))
              (i64.const 32))
            (i64.const 4)))
        (i64.const 64424509444)
        (local.get $l1)
        (i64.const 4)))
    (unreachable))
  (func $init (export "init") (type $t3) (result i64)
    (drop
      (call $l._ ;; this is a call to `storage_set` in Soroban
        (i64.const 0)
        (i64.const 0)
        (i64.const 2)))
    (i64.const 2))
  (memory $memory (export "memory") 2)
  (global $__stack_pointer (mut i32) (i32.const 1048576))
  (data $.rodata (i32.const 1024) "Constructor calledmath overflow,\0a"))

You can see that the constructor is exported as __unnamed_1 and contains the logic for the print call in Soroban. You can also see the init function initializing the storage data (it is supposed to call storage_initializer, but LLVM optimized this by inlining the call instead of allocating another stack frame).
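For anyone reading the constants in the dump, here is a minimal sketch of how they decode. It assumes the Soroban host's tagged 64-bit value layout, with the tag in the low 8 bits and tag values Void = 2, U32Val = 4, and U64Small = 6. The helper names are made up for this sketch and are not part of Solang or the Soroban SDK.

```rust
// Tag values assumed from the Soroban host value encoding (low 8 bits).
const TAG_VOID: u64 = 2;
const TAG_U32: u64 = 4;
const TAG_U64_SMALL: u64 = 6;

// A u32 is stored in the upper 32 bits, matching the
// (i64.or (i64.shl (i64.extend_i32_u ...) (i64.const 32)) (i64.const 4)) pattern.
fn u32_val(v: u32) -> u64 {
    ((v as u64) << 32) | TAG_U32
}

// A small u64 is shifted left by 8 bits, matching the
// (i64.or (i64.shl ... (i64.const 8)) (i64.const 6)) pattern in decrement.
fn u64_small(v: u64) -> u64 {
    (v << 8) | TAG_U64_SMALL
}

// Extract the tag from an encoded value.
fn tag(val: u64) -> u64 {
    val & 0xff
}

// Extract the payload of a value encoded by u32_val.
fn u32_body(val: u64) -> u32 {
    (val >> 32) as u32
}
```

Under these assumptions, `u32_val(18)` equals 77309411332, the second argument of the print call in the constructor, and 18 is the length of "Constructor called" stored at offset 1024. Likewise `u32_val(15)` equals 64424509444, the length of "math overflow,\n" at offset 1042. The trailing `(i64.const 2)` that both `__unnamed_1` and `init` return has tag `TAG_VOID`.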

Two things need to be done: 1- rename the function __unnamed_1 to __constructor, so that the Soroban env can recognize it as the constructor and call it upon deployment. 2- insert a call to storage_initializer before executing the constructor logic.

Does that answer your question?

Nov 13 '24 23:11 salaheldinsoliman

Thanks for mentioning wasm2wat—it really helps with debugging.

U see that the constructor is named unnamed and it contains some logic for doing the print in Soroban. U see the function init doing some logic to initialize data

So, is it safe to say that __unnamed_1 and storage_initializer together form the constructor, which we should re-export as __constructor?

Also, is the function __unnamed_1 a standard function that’s exported with the same name each time? Meaning, can I do..

let constructor = binary.module.get_function("__unnamed_1").unwrap();

?

function unnamed to be called __constructor, so the the soroban env can recognize it is indeed the constructor and call it upon deployment.

I'm not sure I fully understand this part. If a user defines their own version of __constructor and we prepend __unnamed_1 and storage_initializer to their instructions, then the Soroban VM wouldn’t be able to call the constructor upon deployment. This is because the developer’s constructor might have arguments that they would need to manually pass.

Something related that I found while looking into how this is implemented in rs-soroban-sdk is that we’re currently using register_contract_wasm here. I think we should make use of register_contract_wasm_with_constructor, but I’m not sure how that would work in Solang.

Nov 18 '24 22:11 tareknaser