ref-fvm
ref-fvm copied to clipboard
Generics and transmutation make replacing Kernel very difficult
@alexytsu and I are improving tooling for benchmarking and profiling native actors on the FVM (see https://github.com/anorth/fvm-workbench and https://github.com/filecoin-project/builtin-actors/issues/1236). We've run into great difficulty attempting to fake out expensive VM intrinsic operations like proof verification, the CryptoOps
trait of Kernel
.
We can wrap the DefaultKernel
and delegate most methods...
pub struct BenchKernel<B: Blockstore + 'static> {
kernel: DefaultKernel<DefaultCallManager<DefaultMachine<B, FakeExterns>>>,
// ...
}
let executor = DefaultExecutor::<BenchKernel<B>>::new(EnginePool::new_default(engine_conf)?, machine)?;
... but this is only effective for the top-level call. During a sub-call, the wrapped DefaultKernel
specifies a static Self
as a type parameter to CallManger::send
. No instance reference is passed (and indeed the kernel later consumes itself).
let result = self.call_manager.with_transaction(|cm| {
cm.send::<Self>(
from, *recipient, method, params, value, gas_limit, read_only,
)
})?;
The Kernel
trait specifies a new()
constructor, and the call manager later uses that to construct a new kernel for the subcall, but it always constructs DefaultKernel
.
With reference to some code in the conformance testing package, we tried also wrapping the CallManger so as to specify our own BenchKernel
as Self
instead, also replacing with_transaction
to do a funky transmutation. This also isn't doing the job. When the DefaultCallManager
instantiates a Kernel inside send_resolved it passes itself in. Therefore, when the first Kernel is created, we get a BenchKernel<DefaultCallManager>
rather than a BenchKernel<BenchCallManager>
as needed. It looks like for the BenchCallManager
to intercept this it would need to re-implement that entire call stack
CallManager::send -> DefaultCallManager::send_unchecked -> DefaultCallManager::send_resolved
.
The conformance testing code also uses a TestMachine
, we're not yet sure if this is necessary. But big picture, this static type-parameter based structure makes it very difficult to replace the kernel. There's probably some amount of reimplementation of these components that will get us there, but that would introduce huge and fragile coupling to the internals of the FVM.
A much simpler way would be for the Kernel
type parameter to CallManager
be instead a runtime parameter of a Kernel
constructor/factory function (removing new()
from the Kernel
trait). It would then be unnecessary to wrap CallManager or anything else. This would involve following a pointer during an internal send, but much less code and complexity (less monomorphised code to build too).
@alexytsu and I will try this out to confirm it works.
I can see that adding anything dynamic here is going to be a tough slog against the current norms.
I'm having a go adding an additional method type parameter to Kernel::send in #1780
The TestMachine
does seem like a necessary part of the conformance testing set-up. It's tightly coupled to the TestKernel
in that it holds the kernel configuration data necessary to pass through when a kernel destroys itself and then is later re-created.
Though the kernel can now be successfully replaced https://github.com/filecoin-project/ref-fvm/pull/1780 without the need for replacing the DefaultExecutor or DefaultCallManager, it is still hard to provide an API for dynamic mock behaviour of the Kernel.
The kernel is not long-lived and it's construction is handled by the CallManager so we usually won't have access to inject it with fake data or instruct it to intercept/passthrough certain calls dynamically.
So, when we built this, we wanted to pass a constructor through. However, we ran into some recursive type issues. Kernel
contains a CallManager
which contains a Machine
, so the CallManager
can't be constructed with knowledge of the kernel. I.e., We can't have CallManager<Kernel<CallManager=???>>>
.
An alternative would be dynamic types. Unfortunately, due to some wasmtime restrictions, this would need to be Box<dyn ...>
(the kernel needs a static lifetime).
However dynamic types (objects) can't have generic methods. Which... may be fine, actually. The performance will be slightly worse, but none of this should be performance critical...
Hm. I dug into making this dynamic and... we probably need to make the blockstore dynamic first. We don't have to, but the types get kind of annoying if we don't.
Hm. I dug into making this dynamic and... we probably need to make the blockstore dynamic first. We don't have to, but the types get kind of annoying if we don't.
See also https://github.com/filecoin-project/builtin-actors/issues/133#issuecomment-1340101111