elrond-wasm-rs

no bigInt under the given handle

Open gfusee opened this issue 2 years ago • 4 comments

Hi! I get the error "no bigInt under the given handle" when running a contract with the erdpy contract test command, but everything works when testing with the Rust Testing Framework.

It seems to be a compiler issue, because the error disappears when I delete some code that has nothing to do with the function that produces the error.

I created a repo in order to reproduce the error.

https://github.com/gfusee/no-bigint-handle-bug-reproduction

gfusee avatar Jun 20 '22 15:06 gfusee

the issue is still there in elrond-wasm-rs 0.33.0

gfusee avatar Jun 20 '22 20:06 gfusee

the issue is still there in elrond-wasm-rs 0.33.1

gfusee avatar Jun 27 '22 08:06 gfusee

Hi! This is a bug in the Rust to Wasm compiler. What happens is that a function argument gets passed incorrectly.

In this case, arg4 ends up with the wrong value. A BigUint is really just a wrapped handle. The value -105 is supposed to be passed, but the value 1 or 0 appears inside the function instead, depending on the compiler optimization configuration. This is why the behavior seems so erratic.
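To make the "wrapped handle" idea concrete, here is a minimal sketch in plain Rust. It is a toy model only: `ToyVm` and `ToyBigUint` are hypothetical names invented for illustration, not types from the framework or the VM. It assumes the VM keeps big integers in a table keyed by an integer handle, so that a handle which was never stored has nothing to look up.

```rust
use std::collections::HashMap;

/// Toy stand-in for the VM-side storage of big integers, keyed by handle.
/// Hypothetical: this is not the real VM data structure.
pub struct ToyVm {
    big_ints: HashMap<i32, u64>,
}

/// Toy stand-in for BigUint: the contract side only holds the handle.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ToyBigUint {
    pub handle: i32,
}

impl ToyVm {
    pub fn new() -> Self {
        ToyVm { big_ints: HashMap::new() }
    }

    /// Store a value under a handle and return the wrapper around that handle.
    pub fn store(&mut self, handle: i32, value: u64) -> ToyBigUint {
        self.big_ints.insert(handle, value);
        ToyBigUint { handle }
    }

    /// Look a value up by handle; fail when nothing is stored under it.
    pub fn load(&self, n: ToyBigUint) -> Result<u64, &'static str> {
        self.big_ints
            .get(&n.handle)
            .copied()
            .ok_or("no bigInt under the given handle")
    }
}
```

In this model, a value stored under handle -105 loads fine, while a wrapper whose handle was silently replaced by 1 fails the lookup, which matches the shape of the reported error.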

It's not the first time we have encountered this behavior, although it is rare. It is difficult to reproduce, so we're very glad that you managed to do so here. There is a workaround: regroup the arguments and/or place them in a structure. The proper solution would be to strip the example down even further and file an issue with the compiler team.
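The "place them in a structure" workaround could look roughly like the sketch below. It uses plain Rust with a hypothetical `Handle` stand-in rather than the real BigUint, and the names `f_flat`, `f_grouped` and `FArgs` are invented for illustration. It only shows the refactoring of the signature; whether a given regrouping actually avoids the miscompilation has to be checked case by case.

```rust
/// Hypothetical stand-in for BigUint: just a wrapped handle.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Handle(pub i32);

/// Before: every handle is passed as its own argument.
pub fn f_flat(a1: Handle, a2: Handle, a3: Handle, a4: Handle, a5: Handle, a6: u64) -> i64 {
    // Sum the raw handles so the result depends on every argument.
    [a1, a2, a3, a4, a5].iter().map(|h| h.0 as i64).sum::<i64>() + a6 as i64
}

/// After: the same data grouped in one structure and passed by reference.
pub struct FArgs {
    pub amounts: [Handle; 5],
    pub flag: u64,
}

pub fn f_grouped(args: &FArgs) -> i64 {
    // Same computation as f_flat, reading the fields of the structure.
    args.amounts.iter().map(|h| h.0 as i64).sum::<i64>() + args.flag as i64
}
```

Both functions compute the same result; only the way the arguments reach the function changes, which is the part the workaround relies on.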

andrei-marinica avatar Jul 04 '22 14:07 andrei-marinica

I see! Indeed, it is really strange that 1 or 0 is passed, since all handles above -100 are reserved for the Arwen VM. Thank you for the workaround!

gfusee avatar Jul 25 '22 19:07 gfusee

We are still investigating this issue, which is related to the compiler and to Wasmer. A workaround that sometimes works is setting opt-level = 3 instead of opt-level = "z" under [profile.release] in the Cargo.toml file of the wasm crate. Thank you for the provided codebase, it was of great help during the investigation!
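As a sketch, the change described above would look like this in the Cargo.toml of the wasm crate. Only the opt-level line comes from this comment; any other keys already present under [profile.release] are assumed to stay as they are.

```toml
# Cargo.toml of the wasm crate
[profile.release]
# was: opt-level = "z"
opt-level = 3
```

This only sometimes helps, so the contract should be re-tested against the VM after the change.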

andrei-marinica avatar Mar 16 '23 07:03 andrei-marinica

We also discovered this issue while auditing a project. There was no logical way for the project to fix this issue. They fixed it by just randomly shuffling the variables.

This issue is critical: projects have no logical way to prevent it (only chance), cannot catch it at compile time or with Rust tests, and if it is first encountered on mainnet, the consequences can be severe (e.g. if it happens in the callback of a liquid staking protocol).

Here is the smallest smart contract that contains the issue:

#![no_std]

multiversx_sc::imports!();

#[multiversx_sc::contract]
pub trait Contract {
    #[init]
    fn init(&self) {
        self.f(
            BigUint::zero(),
            BigUint::zero(),
            BigUint::zero(),
            BigUint::zero(),
            BigUint::zero(),
            0,
        );
        self.f(
            BigUint::zero(),
            BigUint::zero(),
            BigUint::zero(),
            BigUint::zero(),
            BigUint::zero(),
            1,
        );
    }

    fn f(
        &self,
        arg1: BigUint,
        arg2: BigUint,
        arg3: BigUint,
        arg4: BigUint,
        arg5: BigUint,
        arg6: u64,
    ) {
        let _ = arg1 == 0;
        let _ = arg2 == 0;
        let _ = arg3 == 0;
        let _ = arg4 == 0;
        let _ = arg5 == 0;
        let _ = BigUint::from(arg6) == 0;
    }
}

This smart contract can't be deployed. Deployment fails with a "No bigInt under the given handle" error. The deployment transaction that failed: d44d095793f29b54ea3942b9617c98848d31cbf3c1859229cf1a4db2c042b7b3

Repository to deploy this smart contract and see the failure: https://github.com/lcswillems/no-bigint-hangle-bug-contract

lcswillems avatar Aug 24 '23 19:08 lcswillems

just tried your contract @lcswillems and it doesn't deploy even on mandos

setting opt-level to any value in [0, 1, 2, 3, "z"] doesn't solve the issue

@lcswillems the good point here is that you can catch the issue using mandos tests instead of Rust ones, since mandos uses the same runtime as the VM

gfusee avatar Aug 26 '23 15:08 gfusee

Indeed, we can catch it with Mandos. The only problem is that, because there is no logical way to know when the bug appears, we cannot expect to have Mandos tests that cover the issue.

lcswillems avatar Aug 26 '23 17:08 lcswillems

This issue was reproduced by us and the community several times.

It was caused by Wasmer 1. Now, after the latest mainnet release, Wasmer 1 is no longer in use, and this issue no longer reproduces.

andrei-marinica avatar Jan 23 '24 13:01 andrei-marinica

That's great news!! :fire: :fire:

lcswillems avatar Jan 23 '24 13:01 lcswillems