graph-node
graph-node copied to clipboard
early false positive filter
Could the issue be better described, is this a correctness fix or an optimization? If it's correctness I'm surprised that match_and_decode
would somehow be giving false positives.
The issue is that if match_and_decode filters every trigger out, an "empty" POI is created because the main loop always assumes there is a valid set of triggers here
So this is both a fix (we don't process anything if there are no triggers) and an optimization because we avoid a lot of processing by moving this check to earlier point in the flow.
Thanks for the explanation, please wait for a review from me before merging.
On the correctness fix, the question is why is a PoI being written in blocks that processed no triggers. I believe this is due to this start_handler
call:
https://github.com/graphprotocol/graph-node/blob/4850b828f9257a7725758e5dc3d168b39305a790/core/src/subgraph/trigger_processor.rs#L37-L39
Which is called before match_and_decode
so it may be called even if no handler is actually started. This was added in https://github.com/graphprotocol/graph-node/pull/2748, and I'm unsure of the rationale, but regardless we need to be compatible with it.
I believe the most minimal fix would be to call start_handler
before processing the trigger on the first host that matches.
The main issue is the assumption that any block then went through filtering will have at least a single valid trigger. There might be other places with this assumption, so IMV my fix is correct because it re-introduces that assumption. The only optimisation here is doing it earlier than the hosts and that is correct as well IMO because that's how the normal flow behaves. So having a minimal duplication (which is not actual code duplication) is a small price to pay.
Fixing the issue as you're proposing my expose other places where the assumption exist (and should exist) I have seen similar issues while working on the new substreams processing.
The main issue is the assumption that any block then went through filtering will have at least a single valid trigger.
True this bug exposes this accidental assumption, but the intended design is for block stream filtering to allow false positives, even if all triggers in the block are false positives. So we should identify and fix all places where this incorrect assumption is being made.