go-ethereum
Allow `evm t8n` to generate traces to stdout or to a known filename
Rationale
This is about the `evm t8n` tool.
EVM traces can be huge; some of the Common Tests generate multi-GB traces. This complicates logistics, particularly when one is just starting to work on a trace and only its beginning would be enough.
It would be useful to be able to pipe the traces out of `evm t8n`, either from stdout (in a clean JSONL way) or from a named pipe. That way, another program can consume the pipe in typical Unix fashion and block/kill the producer as needed, without needing to run the whole test or store the huge trace.
Currently this can't be done because `evm t8n --trace` only writes to a file, and the file's name is the transaction hash; so we don't know what filename to use in `mkfifo`.
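To illustrate the consumer side of this request, here is a minimal Python sketch, assuming the trace arrives as one JSON object per line on some text stream (stdin or a named pipe). The helper name `head_of_trace` is hypothetical and not part of any tool; it only shows that a consumer can take the first few steps and leave the rest unread.

```python
import io
import itertools
import json


def head_of_trace(stream, limit):
    """Parse at most `limit` JSONL trace steps from a text stream."""
    # Skip blank lines lazily, so nothing beyond the requested steps is consumed.
    lines = (line for line in stream if line.strip())
    return [json.loads(line) for line in itertools.islice(lines, limit)]


# In-memory stand-in for a trace stream; a real consumer might pass sys.stdin.
sample = io.StringIO(
    '{"pc":0,"op":96,"opName":"PUSH1"}\n'
    '{"pc":2,"op":64,"opName":"BLOCKHASH"}\n'
    '{"pc":3,"op":0,"opName":"STOP"}\n'
)
first_two = head_of_trace(sample, 2)
unread = sample.read()  # the third step was never parsed
```

With a real pipe, the consumer would simply close the stream after the steps it needs.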
Implementation
Having an option to choose the trace filename would fix this. If multiple JSONL files are needed because there are multiple transactions, then maybe adding a counter might be enough. (filename1.jsonl, filename2.jsonl, etc)
Are you aware that exactly this is already supported, via `evm --json statetest <statetest.json>`? The inputs for `t8n` and `statetest` are not identical, but pretty close. The `statetest` feature of `evm`, with its line-by-line streaming JSON output, is used by evm-fuzzers to fuzz against nethermind / besu etc.
That's what I tried, and the stderr in `statetest` is close to what I want, but it doesn't allow me to select a fork, a d,g,v value, etc. So I'd have to implement quite a bit of logic to deal with that.
So I thought to use `t8n` instead and drive it from retesteth. But then I lose the stderr traces.
> but it doesn't allow me to select a fork, a d,g,v value, etc.
You can do it like this example from goevmlab: https://github.com/holiman/goevmlab/blob/master/evms/testdata/statetest1.json . Make the test contain one test only. Set the `d,g,v` as you desire. Set a `post` section for only the fork you are interested in. The hash can be set to `0x0000000000000000000000000000000000000000000000000000000000000000` -- the client will say "oh no test error, stateroot wrong" but you can ignore that.
I mean, I'm fine with making `t8n` support a more flexible trace output, sure, just want to let you know about the alternatives :)
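The narrowing steps described in this comment could be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe: the field names (`transaction.data`, `transaction.gasLimit`, `transaction.value`, `post`, `indexes`, `hash`) reflect my understanding of the state test layout and should be checked against the linked goevmlab example, and `narrow_statetest` is a hypothetical helper name.

```python
import copy

ZERO_HASH = "0x" + "00" * 32


def narrow_statetest(test, fork, d=0, g=0, v=0):
    """Return a copy of one state test reduced to a single fork and d,g,v choice."""
    out = copy.deepcopy(test)
    tx = out["transaction"]
    # Keep only the selected data / gasLimit / value alternatives.
    tx["data"] = [tx["data"][d]]
    tx["gasLimit"] = [tx["gasLimit"][g]]
    tx["value"] = [tx["value"][v]]
    # Reuse the original post entry for this choice if there is one,
    # so that any other fields it carries are preserved (assumption).
    wanted = {"data": d, "gas": g, "value": v}
    original = next(
        (e for e in test.get("post", {}).get(fork, []) if e.get("indexes") == wanted),
        {},
    )
    entry = dict(original)
    entry["hash"] = ZERO_HASH  # expected state root is deliberately left wrong
    entry["indexes"] = {"data": 0, "gas": 0, "value": 0}
    out["post"] = {fork: [entry]}
    return out


# Made-up miniature test body, for illustration only.
sample = {
    "env": {},
    "pre": {},
    "transaction": {
        "data": ["0x", "0x01"],
        "gasLimit": ["0x5208", "0x0f4240"],
        "value": ["0x00"],
    },
    "post": {
        "Berlin": [
            {"hash": "0x" + "11" * 32, "logs": "0x" + "22" * 32,
             "indexes": {"data": 1, "gas": 1, "value": 0}},
        ],
        "London": [
            {"hash": "0x" + "33" * 32, "logs": "0x" + "22" * 32,
             "indexes": {"data": 1, "gas": 1, "value": 0}},
        ],
    },
}
narrowed = narrow_statetest(sample, "Berlin", d=1, g=1, v=0)
```

Note that this is exactly the kind of per-test processing that has to track the test format, which is the cost being weighed in this thread.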
Thank you for the idea, but that sounds rather like going in the direction I'd like to avoid: doing custom processing on most tests (with the attendant format interpretation, tracking of changes, etc.) instead of ... just using them to generate traces.
I am in favor of the t8n tool supporting ranged vmtrace output.
> If multiple JSONL files are needed because there are multiple transactions, then maybe adding a counter might be enough. (filename1.jsonl, filename2.jsonl, etc)
This is a good suggestion. I'll add a switch for that. We'll play with it a bit.
I've tested this a bit now. So, if I just do `mkfifo /tmp/mypipe`, then listen while also executing `evm t8n`, I get the output:
[user@work tmp]$ cat < mypipe
{"pc":0,"op":96,"gas":"0x5f58ef8","gasCost":"0x3","memSize":0,"stack":[],"depth":1,"refund":0,"opName":"PUSH1"}
{"pc":2,"op":64,"gas":"0x5f58ef5","gasCost":"0x14","memSize":0,"stack":["0x1"],"depth":1,"refund":0,"opName":"BLOCKHASH"}
{"pc":3,"op":0,"gas":"0x5f58ee1","gasCost":"0x0","memSize":0,"stack":["0xdac58aa524e50956d0c0bae7f3f8bb9d35381365d07804dd5b48a5a297c06af4"],"depth":1,"refund":0,"opName":"STOP"}
{"output":"","gasUsed":"0x17","time":100979}
(In this example, I modded the lookup so it "opened" `/tmp/mypipe` instead of `trace-<index>-<hash>.jsonl`.)
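The experiment above can be reproduced in miniature without `evm` at all. The following Python sketch pre-creates a named pipe and reads JSONL from it while a writer thread fills it; `fake_tracer` is a hypothetical stand-in for whatever process opens the pre-created path and writes trace lines, and the file name `mypipe` is just an example.

```python
import json
import os
import tempfile
import threading

TRACE_LINES = [
    '{"pc":0,"op":96,"opName":"PUSH1"}',
    '{"pc":2,"op":64,"opName":"BLOCKHASH"}',
    '{"pc":3,"op":0,"opName":"STOP"}',
    '{"output":"","gasUsed":"0x17"}',
]


def fake_tracer(path):
    """Stand-in producer: opens the existing path like a regular file and writes JSONL."""
    with open(path, "w") as out:
        for line in TRACE_LINES:
            out.write(line + "\n")


def read_trace_via_fifo():
    with tempfile.TemporaryDirectory() as workdir:
        fifo = os.path.join(workdir, "mypipe")
        os.mkfifo(fifo)  # create the pipe "from the outside" before the producer runs
        writer = threading.Thread(target=fake_tracer, args=(fifo,))
        writer.start()
        # Opening for reading blocks until the producer opens the pipe for writing.
        with open(fifo) as trace:
            steps = [json.loads(line) for line in trace]
        writer.join()
        return steps


steps = read_trace_via_fifo()
```

The producer needs no special support for this; it only has to open a path whose name the consumer knows in advance, which is the point of contention in this issue.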
Since the filenames are possible to figure out in advance, is there anything that needs to be done here?
Sorry for the late answer, I went on a long holiday and I'm slowly coming back to context...
> I don't understand the "I modded the lookup" line, what do you mean?
As for the filenames being "possible to figure out in advance", what is the advantage of having the hash in the name? I'd say it only makes life difficult for the consumer.
Anyway, I guess I can just search/poll the directory for entries with the expected initial substring and ignore the hash.
Sorry, no, I didn't think it through. Having to calculate the hash to know what filename to expect was the actual problem I described when I opened this issue.
I would like to drive `evm t8n` from retesteth while capturing the traces in a pipe. Retesteth allows me to run tests while knowing almost nothing about them; just the name is enough. So far, great. But now, if I want to prepare a pipe, I need to parse the test, decide which test case inside applies, deal with RLP and hashing...
> I don't understand the "I modded the lookup" line, what do you mean?
I meant that when I first created the file "from the outside", via `mkfifo` in my case, then `t8n` happily wrote into the file (which was in fact a pipe). So I could conclude that it is possible to pre-create pipes and have t8n write into them.
So instead of having `trace-<index>-<hash>.jsonl`, doing `trace-<index>.jsonl`, that would solve your case?
> So instead of having `trace-<index>-<hash>.jsonl`, doing `trace-<index>.jsonl`, that would solve your case?
Still, seems messy to me. So you would figure out how many pipes you need, pre-create them then run t8n, and read the trace-output while it's executing.
Why not just run t8n, and then slurp up the files it spat out, using a glob pattern for `trace-*`?
> Why not just run t8n, and then slurp up the files it spat out, using a glob pattern for `trace-*`?
Because that means waiting until t8n finishes running. But my use case is that I want to stop the run the moment I detect a problem. Some traces are many GB long, so waiting until the end is a waste of time and space.
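The early-stop behaviour described here can be demonstrated with a small Python sketch. It assumes a Unix system and uses `endless_producer`, a hypothetical stand-in for a tracer that would otherwise run for a very long time; the stop condition (`pc == 1000`) is arbitrary. When the reader closes its end of the pipe, the producer's next write fails with a broken pipe and it stops, so nothing is stored and nothing runs to completion.

```python
import json
import os
import tempfile
import threading


def endless_producer(path, result):
    """Stand-in tracer: writes JSONL steps until the reader goes away."""
    pc = 0
    try:
        with open(path, "w") as out:
            while True:
                out.write(json.dumps({"pc": pc, "opName": "JUMPDEST"}) + "\n")
                out.flush()
                pc += 1
    except BrokenPipeError:
        # The consumer closed the pipe: treat this as the signal to stop.
        result["stopped"] = True


def read_until(path, is_problem):
    """Return the first step for which is_problem(step) is true, then close the pipe."""
    with open(path) as trace:
        for line in trace:
            step = json.loads(line)
            if is_problem(step):
                return step
    return None


with tempfile.TemporaryDirectory() as workdir:
    fifo = os.path.join(workdir, "trace.jsonl")
    os.mkfifo(fifo)
    result = {}
    producer = threading.Thread(
        target=endless_producer, args=(fifo, result), daemon=True
    )
    producer.start()
    problem = read_until(fifo, lambda step: step["pc"] == 1000)
    producer.join(timeout=10)
```

A separate producer process would, by default, be terminated by SIGPIPE in the same situation unless it handles or ignores that signal.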
> So you would figure out how many pipes you need, pre-create them then run t8n, and read the trace-output while it's executing.
No need to figure out the number. It's easy to create 10 fifos with the potential names (transaction-{1..10}.jsonl), use them as they come alive and delete them all at the end.
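A minimal sketch of this "create them all, delete them all" idea in Python, assuming a Unix system. The context manager name `candidate_fifos` is hypothetical; the name pattern follows the `transaction-{1..10}.jsonl` example above, and the count of 10 is simply the number mentioned there.

```python
import contextlib
import os
import stat
import tempfile


@contextlib.contextmanager
def candidate_fifos(directory, count=10, pattern="transaction-{}.jsonl"):
    """Pre-create `count` named pipes and remove them all on exit."""
    paths = [os.path.join(directory, pattern.format(i)) for i in range(1, count + 1)]
    created = []
    try:
        for path in paths:
            os.mkfifo(path)
            created.append(path)
        yield paths
    finally:
        # Delete every pipe that was created, whether or not it was ever used.
        for path in created:
            os.remove(path)


with tempfile.TemporaryDirectory() as workdir:
    with candidate_fifos(workdir) as fifos:
        all_are_fifos = all(stat.S_ISFIFO(os.stat(p).st_mode) for p in fifos)
        names = [os.path.basename(p) for p in fifos]
    leftover = os.listdir(workdir)
```

Reading from each pipe as it "comes alive" is left out here; a blocking open on a pipe simply waits until a writer opens it, so unused pipes cost nothing until cleanup.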
But it is worth noting that there are fewer than 20 tests containing more than 1 transaction in 1 block, versus thousands of tests with a single transaction per block. So dealing with a single trace is the main case to think of.
Just in case, I should qualify that this count applies to the BlockchainTests suite of the Common Tests, for the Berlin fork, ignoring bcMultiChainTest and bcTotalDifficultyTest.