
Firehose EVM tracer addition

Open maoueh opened this issue 1 year ago • 0 comments

Initial commit to see the general idea of how Firehose tracer is instantiated and configured

Starter

Important: This is a WIP PR that I am pushing now so that you can see what it looks like; we can then discuss concrete solutions to the questions I still have unanswered.

Here are the important decision points I took and the open questions I have that will shape the future PR(s).

Ethereum core.BlockchainLogger vs x/evm/tracers#BlockchainLogger

The standard Geth has core.BlockchainLogger which acts as a "live" tracer that traces the full block execution in addition to the EVM and State tracing.

However, core.BlockchainLogger is designed around the Ethereum block model in a way that is too rigid to fit the Sei model. For example, OnBlockStart receives a *types.Block, which is too hard to replicate in Sei (if it received a types.Header instead, we could maybe have retrofitted it).

Furthermore, the BlockchainLogger methods (OnBlockStart, OnBlockEnd, etc.) are invoked from core/blockchain.go, an execution point that is not present in Sei. So it was a bit moot to try to use core.BlockchainLogger in Sei directly, given the interface changes that would be needed.

We could have followed that route, e.g. changing the core.BlockchainLogger interface to fit Sei, but I judged it was better to limit the changes in go-ethereum and instead define our own version of BlockchainLogger.

For now I kept the same name, and its semantics fit only Ethereum transactions, so right now it is not a general Sei tracer.

I also prefixed all methods of this interface with OnSei.... The reason is that it enables implementing both interfaces from a single struct, which I see as a possibility for our FirehoseTracer.

Open questions:

  • Right now the x/evm/tracers#BlockchainLogger is EVM-aware only. Should it become a general tracer for Sei, able to trace all types of transactions, including EVM? I think this discussion can be postponed. First, a survey of the Cosmos tracing world would be needed, as we would probably want to follow the current standard. Second, we might want to have this as a general Cosmos/Tendermint piece instead.

Tracers Directory and Activation

Right now I force the instantiation of the FirehoseTracer straight in app.go. The Geth tracer PR introduces a registry where standard tracers are added. The node operator can then activate tracing by passing the flag geth ... --vmtrace=<tracer-id>, e.g. geth ... --vmtrace=firehose.

I plan to offer a similar experience in Sei, so I will add some kind of registry for the tracers plus func init() registration. We could think of adding a DebuggingTracer that would print every tracing entry point.

If you could give me some hints on the names you would like to use, I'll prepare something.

FirehoseTracer & Inclusion

The FirehoseTracer struct is our implementation of x/evm/tracers#BlockchainLogger. How it works is relatively simple. When a block starts, we instantiate our Ethereum Block Model struct, which is a Protobuf-generated Go struct; the definitions can be seen at https://buf.build/streamingfast/firehose-ethereum/docs/main:sf.ethereum.type.v2#sf.ethereum.type.v2.Block.

As the tracer is invoked (CaptureTxStart, CaptureStart, etc.), we decode and accumulate all the information in our block model. CaptureTxStart leads to a new active TransactionTracer, which contains a list of calls (Call[]), each call recording the various state changes (Log, Balance, Nonce, etc.).
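The accumulation can be pictured with the following sketch. The types are heavily simplified, hypothetical stand-ins for the Protobuf block model, and enterCall, exitCall and onLog are illustrative hook names rather than the real tracer signatures. Only logs are recorded here; balance and nonce changes would attach to the active call in the same way.

```go
package main

import "fmt"

// Call records the state changes observed during one EVM call.
type Call struct {
	Index uint32
	Depth int
	Logs  []string
}

// TransactionTrace holds every call executed by one transaction.
type TransactionTrace struct {
	Hash  string
	Calls []*Call
}

// blockBuilder accumulates traces for the block being processed.
type blockBuilder struct {
	Transactions []*TransactionTrace
	activeTx     *TransactionTrace
	callStack    []*Call // innermost active call is last
}

// CaptureTxStart opens a new active transaction trace.
func (b *blockBuilder) CaptureTxStart(hash string) {
	b.activeTx = &TransactionTrace{Hash: hash}
	b.callStack = nil
}

// enterCall pushes a new call onto the stack of the active transaction.
func (b *blockBuilder) enterCall() {
	if b.activeTx == nil {
		return
	}
	c := &Call{Index: uint32(len(b.activeTx.Calls) + 1), Depth: len(b.callStack)}
	b.activeTx.Calls = append(b.activeTx.Calls, c)
	b.callStack = append(b.callStack, c)
}

// exitCall pops the innermost call.
func (b *blockBuilder) exitCall() {
	if n := len(b.callStack); n > 0 {
		b.callStack = b.callStack[:n-1]
	}
}

// onLog attaches a log to the innermost active call.
func (b *blockBuilder) onLog(log string) {
	if n := len(b.callStack); n > 0 {
		c := b.callStack[n-1]
		c.Logs = append(c.Logs, log)
	}
}

// CaptureTxEnd closes the active transaction and stores it in the block.
func (b *blockBuilder) CaptureTxEnd() {
	if b.activeTx != nil {
		b.Transactions = append(b.Transactions, b.activeTx)
		b.activeTx = nil
	}
}

func main() {
	b := &blockBuilder{}
	b.CaptureTxStart("0xabc")
	b.enterCall() // root call
	b.onLog("Transfer")
	b.enterCall() // nested call
	b.onLog("Approval")
	b.exitCall()
	b.exitCall()
	b.CaptureTxEnd()

	for _, c := range b.Transactions[0].Calls {
		fmt.Printf("call #%d depth=%d logs=%v\n", c.Index, c.Depth, c.Logs)
	}
}
```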

On OnBlockEnd, the block is "completed": we serialize it to bytes, encode the bytes as base64, and emit on the stdout file descriptor a text line of the form FIRE BLOCK <block's metadata> <final_block_ref> <bytes_base64_payload>. Our fireeth binary manages the seid start process, reading its stdout pipe, and from this point on blocks progress through our code.

I would like to have FirehoseTracer built directly into sei-chain. The main reason is to give operators an easy way to operate a Firehose node without having to download a forked binary that contains the Firehose interface. Moreover, the Firehose output format is relatively simple to parse and can be used from any language that supports Protobuf, so external systems could benefit from it.
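The final encoding step can be sketched as below. The helper name formatFireBlockLine is hypothetical, the metadata and final block reference are treated as opaque, already formatted strings, and the use of standard (padded) base64 is an assumption of this sketch rather than a statement about the exact Firehose format.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// formatFireBlockLine builds one "FIRE BLOCK" line. metadata and
// finalBlockRef are opaque strings here; payload is the serialized block.
// Standard base64 encoding is an assumption of this sketch.
func formatFireBlockLine(metadata, finalBlockRef string, payload []byte) string {
	encoded := base64.StdEncoding.EncodeToString(payload)
	return fmt.Sprintf("FIRE BLOCK %s %s %s\n", metadata, finalBlockRef, encoded)
}

func main() {
	// Placeholder values, not real block data.
	line := formatFireBlockLine("<metadata>", "<final_block_ref>", []byte("serialized block"))
	fmt.Print(line)
}
```

Because base64 output contains no spaces or newlines, a reader can split each line on spaces and decode the last field back into the Protobuf bytes.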

Parallelism, linearity

Right now the tracer in app.go#ProcessBlock is re-instantiated on each block, because I noticed that ProcessBlock is called within a goroutine, which leads me to think that two or more ProcessBlock calls could run concurrently.

Firehose needs to emit the block while holding a lock on the pipe, to ensure that a single line is fully emitted before the next one starts; we do not want concurrent writes to the stdout pipe.

I'm unsure where to ensure that linearity and exclusiveness are respected. For example, if there are indeed two concurrent ProcessBlock calls running, where can I trigger the "emit" sequentially?

Finality

Tony mentioned that ProcessBlock could be called from two code paths, one final and the other not final. Firehose handles forks without a problem but needs finality information about the block: we need to know which parent block is final relative to the current block.

The idea is in pseudo-code something like this:

ProcessBlock(block Block):
  OnBlockStart(block, findFirstFinalBlockStartingFrom(block))
  ...

Chains usually have some kind of deterministic way to determine the final block. In the case of Sei, with its instant finality, and with regard to ProcessBlock, we have multiple avenues:

  • Maybe the final/non-final execution distinction applies only to a miner node and is not relevant for a full node that is simply syncing with the network. If that is the case in Sei, then I would simply treat the current block as final (and add a sanity check for the case where the block is non-final and a tracer is active).
  • The tracer is only called if the execution of ProcessBlock is final; I would need to see where I can get that information.
  • We determine the right version of findFirstFinalBlockStartingFrom(block) for the Sei chain and I use that.
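As an illustration of the first option only, here is a sketch under the assumption that a full node executes final blocks exclusively. BlockRef, finalBlockFor and the executionIsFinal flag are hypothetical; where that flag would actually come from in Sei is exactly the open question.

```go
package main

import (
	"errors"
	"fmt"
)

// BlockRef identifies a block by height and hash (hypothetical type).
type BlockRef struct {
	Height int64
	Hash   string
}

// errNonFinalExecution is the sanity check for a tracer being active while
// a non-final execution of ProcessBlock is running.
var errNonFinalExecution = errors.New("tracer active during non-final block execution")

// finalBlockFor returns the block to report as final when current is being
// processed. With instant finality and final-only execution, that is the
// current block itself.
func finalBlockFor(current BlockRef, executionIsFinal bool) (BlockRef, error) {
	if !executionIsFinal {
		return BlockRef{}, errNonFinalExecution
	}
	return current, nil
}

func main() {
	current := BlockRef{Height: 100, Hash: "0xabc"}

	final, err := finalBlockFor(current, true)
	fmt.Println(final, err)

	_, err = finalBlockFor(current, false)
	fmt.Println(err)
}
```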

RunPrecompiledContract and RunAndCalculateGas

Inside RunPrecompiledContract there is a call to RunAndCalculateGas that results in an early return. In RunPrecompiledContract we track gas via OnGasChange, which means we also need to instrument RunAndCalculateGas, which is an interface method.

What is the correct way to instrument this? It is billing some gas, if I understand it right. However, since RunAndCalculateGas is an interface method, there are multiple implementations, and they would all have to be aware of the tracer.
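One possible direction, sketched here and not taken from the PR, is to report the gas change once at the call site instead of inside every implementation. The interface below uses a simplified, hypothetical signature (the real method takes more parameters), and GasHook stands in for OnGasChange. The trade-off is that only the net gas consumed is visible, not any intermediate steps.

```go
package main

import (
	"errors"
	"fmt"
)

// DynamicGasPrecompile is a simplified stand-in for the interface that
// declares RunAndCalculateGas (hypothetical signature).
type DynamicGasPrecompile interface {
	RunAndCalculateGas(input []byte, suppliedGas uint64) (ret []byte, remainingGas uint64, err error)
}

// GasHook stands in for the tracer's OnGasChange callback.
type GasHook func(oldGas, newGas uint64)

// runTraced calls the precompile and reports the net gas change once, so
// implementations do not need to know about the tracer.
func runTraced(p DynamicGasPrecompile, input []byte, suppliedGas uint64, onGasChange GasHook) ([]byte, uint64, error) {
	ret, remaining, err := p.RunAndCalculateGas(input, suppliedGas)
	if onGasChange != nil && remaining != suppliedGas {
		onGasChange(suppliedGas, remaining)
	}
	return ret, remaining, err
}

// fixedCost is a toy precompile that charges a constant amount of gas.
type fixedCost struct{ cost uint64 }

func (f fixedCost) RunAndCalculateGas(input []byte, suppliedGas uint64) ([]byte, uint64, error) {
	if suppliedGas < f.cost {
		return nil, 0, errors.New("out of gas")
	}
	return input, suppliedGas - f.cost, nil
}

func main() {
	hook := func(oldGas, newGas uint64) {
		fmt.Printf("gas change: %d -> %d\n", oldGas, newGas)
	}
	_, remaining, err := runTraced(fixedCost{cost: 100}, []byte("in"), 1000, hook)
	fmt.Println(remaining, err)
}
```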

Sei Address Association

Sei has an address association between Sei and EVM addresses (sei <=> EVM), based on the fact that the underlying private key is secp256k1 but the public key is mapped differently. Is this tracked in some EVM contract/precompile, or is it kept in the Sei kvdb? Should we track this somehow?

State Snapshot

Our standard Ethereum implementation records the genesis block with the genesis balances, code and storage for the various genesis accounts. Now, it appears that when Sei upgrades to EVM support, the current usei balances are carried over to the EVM.

I see the Firehose tracer as starting from the very block that enables the EVM, meaning that state prior to that point will not be known to Firehose users. This is a problem for advanced technologies that are built on the assumption that the tracer sees the full state of the node, including genesis data.

One possibility I see here is that x/evm/types/BlockchainLogger gets an extra OnEVMGenesisState. The exact interface for querying the state would still need to be defined.

A sense of the amount of data we are talking about will drive the design of such an API. Indeed, if you tell me there are currently 10M Sei users, holding everything in memory is maybe not the best choice.

I would also need to know if there is other state that carries over. I saw some ERC20-like stuff, and I think some mapping could be in the plans, so I imagine that the initial state would be kept in Sei here as well.

Other Changes

Apart from RunAndCalculateGas, and in the same vein of tracking any state changes happening around EVM block execution, do you see other things that are particular to Sei but that somehow fiddle with the state known to the EVM?

maoueh, Feb 10 '24 14:02