ref-fvm icon indicating copy to clipboard operation
ref-fvm copied to clipboard

EVM runtime: Implement DELEGATECALL

Open karim-agha opened this issue 2 years ago • 15 comments

Running an EVM bytecode in the context of another actor's state CID.

karim-agha avatar Jun 14 '22 15:06 karim-agha

This is essential for proxy contracts (80% of all serious contracts)

karim-agha avatar Jun 14 '22 15:06 karim-agha

~We'll need to support DELEGATECALL to non-EVM contracts too.~

raulk avatar Jun 15 '22 00:06 raulk

It's more running another actor's byte code in the context of our actor (state CID, balance, identity, etc). The problem is that the target actor also needs to be able to send as us (otherwise, we could implement this in the EVM shim itself).

So yeah. We're probably going to need a new delegate_call syscall.

We could also break it up into two syscalls:

  1. let cid = actor::get_actor_code_cid(...).
  2. dynamic_call(cid, message).

That would let us dynamically call deployed code whether or not it's deployed as part of an actor.

Stebalien avatar Jun 16 '22 00:06 Stebalien

IMO, we should implement CALLCODE in terms of DELEGATECALL.

Stebalien avatar Jun 16 '22 00:06 Stebalien

So, there's an alternative that may be more performant (in this case):

  1. Add a "GetCode" method to the EVM actor that returns a reference to the EVM actor's bytecode.
  2. Load that code into the running EVM.

This means we won't need to make any changes to the FVM itself.

Stebalien avatar Aug 04 '22 00:08 Stebalien

@Stebalien That works for me. I will actually withdraw my comment above about having to support DELEGATECALL to non-EVM contracts. I think this creates some rough edges (not all targets support DELEGATECALL), but it's fine for M2.1 since it makes zero sense to DELEGATECALL built-in actors. We may want to revisit in M2.2 to provide a uniform feature here.

raulk avatar Aug 04 '22 09:08 raulk

@Stebalien There are just some concerns I have about memory isolation in this case. With DELEGATECALL, the callee doesn't get access to the caller's memory. We gotta be careful to preserve that guarantee.

raulk avatar Aug 04 '22 09:08 raulk

the callee doesn't get access to the caller's memory

Yeah. In this case, we should be fine as we'll be executing within the EVM.

Stebalien avatar Aug 04 '22 16:08 Stebalien

I actually meant the EVM memory of the caller!

raulk avatar Aug 04 '22 16:08 raulk

I actually meant the EVM memory of the caller!

Ah, yes. We'll need to create a new EVM container per DELEGATECALL.

Stebalien avatar Aug 04 '22 16:08 Stebalien

Ok, I was wrong here. DELEGATECALL and CALLCODE take a gas limit, which means we need to be able to stop execution when we reach some amount of Filecoin gas. That means this code must run in a different invocation container.

IMO, our only real option here is to recursively call into ourselves adding a new invoke_with_code(other_actor_addr, params, invocation_context, etc...) method to the EVM actor.

When called, this method would:

  1. Validate that the caller is self.
  2. Fetch the code from other_actor_addr.
  3. Mock the invocation context (specifically, the "ultimate" caller, value, etc).
  4. Execute the other actor's bytecode instead of this actor's evm bytecode.

We'll have to very carefully filter all calls from CALLACTOR to ensure they don't try to call invoke_with_code on ourselves. Really, we might just want to forbid self CALLACTOR calls.

Stebalien avatar Sep 15 '22 23:09 Stebalien

Makes sense, but that would mean the caller needs to save its pre-call state to the blockstore so that the new invocation container can "see" it?

Note to self: EVM memory is not shared when DELEGATECALLing.

raulk avatar Sep 15 '22 23:09 raulk

Makes sense, but that would mean the caller needs to save its pre-call state to the blockstore so that the new invocation container can "see" it?

Unfortunately, yes.

Stebalien avatar Sep 15 '22 23:09 Stebalien

Did you consider sending the root CID as a param (ie. sending the intermediate sub DAG)? Doing so would save a few syscalls. It might be a bit delicate because we don't have proper reachability analysis yet, but it may not entail any risk at this stage either?

raulk avatar Sep 15 '22 23:09 raulk

Ah, you mean the root state CID? You're right, we could do that. We'd need to flush the HAMT (write the blocks from memory into the FVM), but we wouldn't need to call set_root (wouldn't need to pay the costs for persisting).

Honestly, I don't think this is something we should spend any time optimizing. As far as I know, most contracts that use DELEGATECALL:

  1. Lookup the target address in state.
  2. Forward the message to the target address.
  3. Return the result.

I.e., they rarely write to their own state.

Stebalien avatar Sep 15 '22 23:09 Stebalien