go-ethereum
eth/traces: add state limit
This PR introduces a new mechanism in the chain tracer to prevent creating too many trace states.
The workflow of the chain tracer can be divided into several parts:
- the state creator generates trace states in one thread
- the state tracer retrieves a trace state and applies the tracing on top of it in another thread
- the state collector gathers all results from the state tracers and streams them to users
It's basically a producer-consumer model. If the state producer generates states too fast, lots of unused states accumulate in memory. Even worse, the path-based state scheme only keeps the latest 128 states in memory, so each newly generated state invalidates the oldest one by marking it as stale.
The fix is to limit the speed of state generation: if there are more than 128 un-consumed states in memory, creation is paused until states have been consumed properly.
@holiman updated, please take another look
I guess this is the part I don't find obvious:
> Whenever we set the flag for the next state, we will overwrite the value of these elements.
@holiman in my example, the last two elements in the `used` array are completely dangling (they don't point to any state), so the next time we run `t.used[int(number-t.oldest)] = true` via `releaseState`, these elements will be re-assigned to the new states, overwriting the original values in the array. So we don't need to explicitly clean them.
One remaining thought. So we have
- one producer routine to feed all the blocks into the tracers
- `threads` consumer routines, doing tracing.
The producer may (now) get stuck waiting for one of the consumers to release a state:
```go
// Make sure the state creator doesn't go too far. Too many unprocessed
// trace state may cause the oldest state to become stale(e.g. in
// path-based scheme).
if err = tracker.wait(number); err != nil {
	failed = err
	break
}
```
```go
// Tracing state is used up, queue it for de-referencing. Note the
// state is the parent state of trace block, use block.number-1 as
// the state number.
tracker.releaseState(task.block.NumberU64()-1, task.release)
// Stream the result back to the result catcher or abort on teardown
select {
case resCh <- task:
case <-closed:
	return
}
```
Now, what I want to figure out is whether it's at all possible for the following to happen:
- The producer is stuck waiting for a state to be returned,
- The consumers react to the `<-closed` event, and therefore do not drain the `taskCh`, do not release the states sitting in `taskCh`, and thus do not un-stuck the producer.
128 max pending trace-states, 8 threads.
Producer feeds 8 states into the `taskCh`. One of the consumers is really slow, but 7 of them run along:
consumer-0: 1
consumer-1: 128
consumer-2: 127
consumer-3: 126
consumer-4: 125
consumer-5: 124
consumer-6: 123
consumer-7: 122
The producer will be stuck on `tracker.wait(129)`.
At this point, if `closed` is triggered: all of the consumers will invoke `tracker.releaseState`, and thus free up the producer. The producer will then try to deliver the 129 state, but fail, and release that state in its own loop.
TLDR; All good! I don't see how a deadlock could occur here.
If you rebase again, the ubuntu appveyor build failure will be resolved.
The other failure is
```
--- FAIL: TestEth2AssembleBlockWithAnotherBlocksTxs (0.04s)
    api_test.go:121: invalid number of transactions 0 != 1
```
If you want to get rid of that error, then :+1: on https://github.com/ethereum/go-ethereum/pull/25918 :)