go-ethereum

Allow `evm t8n` to generate traces to stdout or to a known filename

Open hmijail opened this issue 2 years ago • 7 comments

Rationale

This is about the evm t8n tool.

EVM traces can be huge; some of the Common Tests generate multi-GB traces. This complicates logistics, particularly if one is starting work on the trace and only the beginning of the trace would be enough.

It would be useful to be able to pipe the traces out of evm t8n, either from stdout (in a clean JSONL way) or from a named pipe. That way, another program can consume the pipe in typical unix fashion and block/kill the producer as needed, without needing to run the whole test or store the huge trace.

Currently this can't be done because evm t8n --trace only writes to a file, and the file's name is the transaction hash; so we don't know what filename to use in mkfifo.
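
The consume-and-kill pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the producer is a stand-in Python process that emits fake JSONL steps (it is not evm, which at the time of this issue only writes traces to files), and `consume_until` and `FAKE_PRODUCER` are hypothetical names.

```python
import json
import subprocess
import sys


def consume_until(cmd, should_stop):
    """Run cmd, decode JSONL from its stdout, and kill it once should_stop(step) is true.

    Returns the decoded steps, ending with the one that triggered the stop.
    """
    steps = []
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    try:
        for line in proc.stdout:
            line = line.strip()
            if not line:
                continue
            step = json.loads(line)
            steps.append(step)
            if should_stop(step):
                break  # enough of the trace has been seen
    finally:
        proc.kill()  # no effect if the producer has already exited
        proc.stdout.close()
        proc.wait()
    return steps


# Stand-in producer (NOT evm): an endless stream of fake trace steps.
FAKE_PRODUCER = [
    sys.executable, "-c",
    "import itertools, json\n"
    "for pc in itertools.count():\n"
    "    print(json.dumps({'pc': pc, 'opName': 'PUSH1'}), flush=True)\n",
]

if __name__ == "__main__":
    seen = consume_until(FAKE_PRODUCER, lambda step: step["pc"] >= 5)
    print(len(seen), "steps read before stopping the producer")
```

With a named pipe the loop would stay the same, reading from the opened FIFO instead of `proc.stdout`; the producer never has to finish and nothing is stored on disk.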

Implementation

Having an option to choose the trace filename would fix this. If multiple JSONL files are needed because there are multiple transactions, then maybe adding a counter might be enough. (filename1.jsonl, filename2.jsonl, etc)

hmijail avatar Aug 18 '22 11:08 hmijail

Are you aware that exactly this is already supported, via evm --json statetest <statetest.json> ? The inputs for t8n and statetest are not identical, but pretty close. The statetest feature of evm, and the line-by-line streaming json output is used by evm-fuzzers to fuzz against nethermind / besu etc.

holiman avatar Aug 18 '22 12:08 holiman

That's what I tried, and the stderr in statetest is close to what I want, but it doesn't allow me to select a fork, a d,g,v value, etc. So I'd have to implement quite a bit of logic to deal with that.

So I thought to use t8n instead and drive it from retesteth. But then I lose the stderr traces.

hmijail avatar Aug 18 '22 12:08 hmijail

but it doesn't allow me to select a fork, a d,g,v value, etc.

You can do it like this example from goevmlab: https://github.com/holiman/goevmlab/blob/master/evms/testdata/statetest1.json . Make the test contain one test only. Set the d,g,v as you desire. Set a post section for only the fork you are interested in. The hash can be set to 0x0000000000000000000000000000000000000000000000000000000000000000 -- the client will say "oh no test error, stateroot wrong" but you can ignore that.
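
A rough sketch of that reduction, for illustration only. It assumes the usual state test layout (a `transaction` object with `data`, `gasLimit` and `value` arrays, and a `post` object keyed by fork name); `single_case` is a hypothetical helper, and zeroing the `logs` hash and slicing `accessLists` are assumptions not stated in this thread.

```python
import copy

ZERO_HASH = "0x" + "00" * 32


def single_case(test, fork, d=0, g=0, v=0):
    """Return a copy of one state test reduced to a single (fork, d, g, v) case."""
    out = copy.deepcopy(test)
    tx = out["transaction"]
    # Keep only the selected data / gasLimit / value entries.
    tx["data"] = [tx["data"][d]]
    tx["gasLimit"] = [tx["gasLimit"][g]]
    tx["value"] = [tx["value"][v]]
    # Assumption: when present, accessLists is indexed in parallel with data.
    if "accessLists" in tx:
        tx["accessLists"] = [tx["accessLists"][d]]
    # One post entry for the chosen fork, with a placeholder state root.
    # Assumption: the logs hash can be zeroed in the same way.
    out["post"] = {
        fork: [{
            "hash": ZERO_HASH,
            "logs": ZERO_HASH,
            "indexes": {"data": 0, "gas": 0, "value": 0},
        }]
    }
    return out
```

The result would be written back under its test name as the only entry of a new JSON file; the client is then expected to report a state root mismatch, which can be ignored as noted above.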

holiman avatar Aug 18 '22 13:08 holiman

I mean, I'm fine with making t8n support a more flexible trace output, sure, just want to let you know about the alternatives :)

holiman avatar Aug 18 '22 13:08 holiman

Thank you for the idea, but that sounds rather like going in the direction I'd like to avoid: do custom processing to most tests (with attendant format interpretation, tracking changes, etc) instead of ... just using them to generate traces.

hmijail avatar Aug 18 '22 13:08 hmijail

I am up for t8n tool to support ranged vmtrace output.

winsvega avatar Aug 18 '22 19:08 winsvega

If multiple JSONL files are needed because there are multiple transactions, then maybe adding a counter might be enough. (filename1.jsonl, filename2.jsonl, etc)

This is a good suggestion. I'll add a switch for that. We'll play with it a bit

holiman avatar Sep 01 '22 08:09 holiman

I've tested this a bit now. So, if I just do mkfifo /tmp/mypipe, then listen while also executing the evm t8n, then I get the output:

[user@work tmp]$ cat < mypipe
{"pc":0,"op":96,"gas":"0x5f58ef8","gasCost":"0x3","memSize":0,"stack":[],"depth":1,"refund":0,"opName":"PUSH1"}
{"pc":2,"op":64,"gas":"0x5f58ef5","gasCost":"0x14","memSize":0,"stack":["0x1"],"depth":1,"refund":0,"opName":"BLOCKHASH"}
{"pc":3,"op":0,"gas":"0x5f58ee1","gasCost":"0x0","memSize":0,"stack":["0xdac58aa524e50956d0c0bae7f3f8bb9d35381365d07804dd5b48a5a297c06af4"],"depth":1,"refund":0,"opName":"STOP"}
{"output":"","gasUsed":"0x17","time":100979}

(In this example, I modded the lookup so it "opened" /tmp/mypipe instead of trace-<index>-<hash>.jsonl).
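
For readers consuming this output, a minimal decoding sketch: it splits the stream into per-opcode steps and the final summary line, and converts the hex gas fields to integers. The sample is the output shown above; `parse_trace` is a hypothetical helper, and treating any line with a `pc` field as a step is an assumption based on this sample.

```python
import json

SAMPLE = """\
{"pc":0,"op":96,"gas":"0x5f58ef8","gasCost":"0x3","memSize":0,"stack":[],"depth":1,"refund":0,"opName":"PUSH1"}
{"pc":2,"op":64,"gas":"0x5f58ef5","gasCost":"0x14","memSize":0,"stack":["0x1"],"depth":1,"refund":0,"opName":"BLOCKHASH"}
{"pc":3,"op":0,"gas":"0x5f58ee1","gasCost":"0x0","memSize":0,"stack":["0xdac58aa524e50956d0c0bae7f3f8bb9d35381365d07804dd5b48a5a297c06af4"],"depth":1,"refund":0,"opName":"STOP"}
{"output":"","gasUsed":"0x17","time":100979}
"""


def parse_trace(text):
    """Split JSONL trace text into (steps, summary), decoding hex gas fields."""
    steps, summary = [], None
    for line in text.splitlines():
        if not line.strip():
            continue
        obj = json.loads(line)
        if "pc" in obj:  # assumption: only per-opcode steps carry a pc
            obj["gas"] = int(obj["gas"], 16)
            obj["gasCost"] = int(obj["gasCost"], 16)
            steps.append(obj)
        else:
            summary = obj
    return steps, summary


if __name__ == "__main__":
    steps, summary = parse_trace(SAMPLE)
    for step in steps:
        print(step["pc"], step["opName"], step["gasCost"])
    print("gasUsed:", int(summary["gasUsed"], 16))
```

In this particular sample the per-step gasCost values (0x3, 0x14, 0x0) add up to the reported gasUsed of 0x17.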

Since the filenames are possible to figure out in advance, is there anything that needs to be done here?

holiman avatar Nov 16 '22 09:11 holiman

Sorry for the late answer, I went on a long holiday and I'm slowly coming back to context...

I don't understand the "I modded the lookup" line, what do you mean?

As for the filenames being "possible to figure out in advance", what is the advantage of having the hash in the name? I'd say it only makes life difficult for the consumer.

hmijail avatar Dec 15 '22 14:12 hmijail

Anyway, I guess I can just search/poll the directory for entries with the expected initial substring and ignore the hash. 

hmijail avatar Dec 15 '22 14:12 hmijail

Sorry, no, I didn't think it through. Having to calculate the hash to know what filename to expect was the actual problem I described when I opened this issue.

I would like to drive evm t8n from retesteth while catching the traces into a pipe. Retesteth allows me to run tests without knowing almost anything about them - just the name is enough. Until here, great. But now, if I want to prepare a pipe, I need to parse the test, decide the test case inside that applies, deal with RLP and hashing...

hmijail avatar Dec 15 '22 22:12 hmijail

I don't understand the "I modded the lookup" line, what do you mean?

I meant that when I first created the file "from the outside", via mkfifo in my case, then t8n happily wrote into the file (which was in fact a pipe). So I could conclude that it is possible to pre-create pipes and have t8n write into them.

So instead of having trace-<index>-<hash>.jsonl, doing trace-<index>.jsonl, that would solve your case?

holiman avatar Dec 20 '22 14:12 holiman

So instead of having trace-<index>-<hash>.jsonl, doing trace-<index>.jsonl, that would solve your case?

Still, seems messy to me. So you would figure out how many pipes you need, pre-create them then run t8n, and read the trace-output while it's executing.

Why not just run t8n, and then slurp up the files it spat out, using a glob-pattern for trace-*. ?

holiman avatar Dec 20 '22 14:12 holiman

Why not just run t8n, and then slurp up the files it spat out, using a glob-pattern for trace-*. ?

Because that means waiting until t8n finishes running. But my use case is that I want to stop the run the moment I detect a problem. Some traces are many GB long, so waiting to the end is a waste of time and space.

So you would figure out how many pipes you need, pre-create them then run t8n, and read the trace-output while it's executing.

No need to figure the number. It's easy to create 10 fifos with the potential names (transaction-{1..10}.jsonl), use them as they come alive and delete them all at the end.
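
A minimal sketch of that pre-creation step, assuming a Unix system. The name template and starting index are hypothetical, since the final trace naming was not settled in this thread; `make_fifos` and `remove_fifos` are illustrative helpers.

```python
import os
import stat
import tempfile


def make_fifos(directory, count=10, template="trace-{}.jsonl", start=0):
    """Create `count` named pipes with predictable names and return their paths."""
    paths = []
    for i in range(start, start + count):
        path = os.path.join(directory, template.format(i))
        os.mkfifo(path)  # Unix only
        paths.append(path)
    return paths


def remove_fifos(paths):
    """Delete the pipes, ignoring any that are already gone."""
    for path in paths:
        try:
            os.unlink(path)
        except FileNotFoundError:
            pass


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        fifos = make_fifos(workdir, count=10)
        print(all(stat.S_ISFIFO(os.stat(p).st_mode) for p in fifos))
        remove_fifos(fifos)
```

A blocking open of a FIFO for reading waits until a writer opens it, so a consumer would typically open the pipes one at a time or from separate threads.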

But worth noting, there's less than 20 tests containing more than 1 transaction in 1 block, vs thousands of tests with a single transaction per block. So dealing with a single trace is the main case to think of.

hmijail avatar Dec 20 '22 23:12 hmijail

Just in case, I thought I'd qualify that those "less than 20 tests" is when considering the BlockchainTests suite of the Common tests, for the fork Berlin, but ignoring bcMultiChainTest and bcTotalDifficultyTest.

hmijail avatar Dec 21 '22 03:12 hmijail