
Feature Request: Support indexed argument filtering

Open dekz opened this issue 4 years ago • 5 comments

Do you want to request a feature or report a bug? Feature

What is the current behavior? Currently The Graph pulls all logs matching the specified topic0. In some cases, ours included, we need to process only a subset of these events rather than every one.

For example, picture a subgraph that wants to process only the Uniswap Swap events that pertain to its dApp. There are many Swap events in almost every block, though not all of these are relevant to the dApp.

We would like to make use of the additional topics to specify the additional indexed fields, allowing us to skip over events which do not relate to our dApp. This would significantly save on syncing time and reduce excess processing by nodes.

Currently we are restricted to the following:

          eventHandlers:
              - event: Swap(indexed address,uint256,uint256,uint256,uint256,indexed address)
                handler: handleGenericSwap

We would like the ability to add additional filtering to the declaration:

Single exact filter

          eventHandlers:
              - event: Swap(indexed address,uint256,uint256,uint256,uint256,indexed address)
                topic1: "0xabcd...." # filter only events where the sender is the 0xabcd address
                handler: handleSwap

Or filter on several values of the first indexed argument

          eventHandlers:
              - event: Swap(indexed address,uint256,uint256,uint256,uint256,indexed address)
                topic1: ["0xabcd....", "0x1234"] # filter only events where the sender is the 0xabcd or 0x1234 address
                handler: handleSwap

Or filter on several values of the second indexed argument

          eventHandlers:
              - event: Swap(indexed address,uint256,uint256,uint256,uint256,indexed address)
                topic2: ["0xabcd....", "0x1234"] # filter only events where the recipient is the 0xabcd or 0x1234 address
                handler: handleSwap
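
For illustration, here is a minimal TypeScript sketch of how a declaration like the ones above might be translated into the `topics` array of a log filter. The `HandlerFilter` shape and the `addressToTopic` and `buildTopics` helpers are hypothetical names, not part of graph-node, and the values used with them below are placeholders. The one fixed fact it relies on is that an indexed `address` shows up in a topic as a 32-byte word, left-padded with zeros.

```typescript
// Hypothetical in-memory form of a handler entry with the proposed fields.
interface HandlerFilter {
  topic0: string;               // hash of the event signature, supplied by the caller
  topic1?: string | string[];   // first indexed argument (addresses in this sketch)
  topic2?: string | string[];   // second indexed argument (addresses in this sketch)
}

// One position in a topics filter: exact value, OR-list, or null wildcard.
type TopicSlot = string | string[] | null;

// Left-pad a 20-byte hex address to the 32-byte word used in log topics.
function addressToTopic(address: string): string {
  const hex = address.toLowerCase().replace(/^0x/, "");
  if (!/^[0-9a-f]{40}$/.test(hex)) {
    throw new Error(`not a 20-byte hex address: ${address}`);
  }
  return "0x" + "0".repeat(24) + hex;
}

// Missing fields become null wildcards; lists stay lists.
function toSlot(value: string | string[] | undefined): TopicSlot {
  if (value === undefined) return null;
  return Array.isArray(value) ? value.map(addressToTopic) : addressToTopic(value);
}

// Assemble the topics array, dropping trailing wildcards since they add nothing.
function buildTopics(filter: HandlerFilter): TopicSlot[] {
  const topics: TopicSlot[] = [filter.topic0, toSlot(filter.topic1), toSlot(filter.topic2)];
  while (topics.length > 0 && topics[topics.length - 1] === null) {
    topics.pop();
  }
  return topics;
}
```

With only `topic2` set, `buildTopics` yields `[topic0, null, [paddedAddress, ...]]`, which has the same shape as the `[null, B]` style of filter an Ethereum node accepts.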

For reference, here is how an Ethereum node is supposed to treat the topics argument in the RPC request.

A transaction with a log with topics [A, B] will be matched by the following topic filters:
  [] “anything”
  [A] “A in first position (and anything after)”
  [null, B] “anything in first position AND B in second position (and anything after)”
  [A, B] “A in first position AND B in second position (and anything after)”
  [[A, B], [A, B]] “(A OR B) in first position AND (A OR B) in second position (and anything after)”
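
The matching rules listed above can be sketched as a small TypeScript predicate. `matchesTopics` and `TopicFilter` are illustrative names, not an existing API, and the sketch covers only the cases in the list: `null` is a wildcard, a string must match exactly, and a nested array means OR. It also assumes that a filter with more positions than the log has topics does not match.

```typescript
// One entry per topic position: exact value, OR-list, or null wildcard.
type TopicFilter = Array<string | string[] | null>;

function matchesTopics(logTopics: string[], filter: TopicFilter): boolean {
  // Assumption of this sketch: a filter longer than the log's topics never matches.
  if (filter.length > logTopics.length) return false;

  // Every filter position must be satisfied; positions beyond the filter are unconstrained.
  return filter.every((slot, i) => {
    if (slot === null) return true;                       // "anything" in this position
    const allowed = Array.isArray(slot) ? slot : [slot];  // normalize to an OR-list
    return allowed.indexOf(logTopics[i]) !== -1;
  });
}

// The five filters from the list, applied to a log with topics [A, B].
const A = "A";
const B = "B";
const log = [A, B];
const results = [
  matchesTopics(log, []),
  matchesTopics(log, [A]),
  matchesTopics(log, [null, B]),
  matchesTopics(log, [A, B]),
  matchesTopics(log, [[A, B], [A, B]]),
];
```

Each entry of `results` is `true`, while a filter such as `[B]` would be rejected because `B` is not in the first position.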

This change would allow us to sync our subgraph much more quickly and reduce the overhead of processing events that are unrelated to our purposes. As an example, it currently takes us a solid 24 hours to process our subgraph, and as a result we store only approximately 100,000 entities.

dekz avatar Dec 11 '20 05:12 dekz

I've implemented something similar in https://github.com/hyperledger/burrow/blob/main/docs/reference/vent.md. We use this PEG grammar: https://github.com/hyperledger/burrow/blob/main/event/query/query.peg, so you can form simple boolean queries, containment checks and the like, which can be useful if you want more than an exact match.

Now that I am defecting to The Graph myself, I would be interested in implementing this here.

[off-topic: I notice that event sigs include indexed in subgraph.yaml (as they ought to: https://github.com/ethereum/solidity/issues/4168). I assume there is no magic that is able to disambiguate events that collide on signature? You can count the number of topics, but as far as I am aware there is no way to distinguish between multiple sigs that share a selector, e.g.

Foo(bytes32 indexed bar, uint256 baz)
Foo(bytes32 bar, uint256 indexed baz)

]
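
To make the collision concrete: topic0 is the hash of the event's canonical signature, which keeps only the event name and parameter types, so `indexed` and parameter names never reach it. The TypeScript sketch below reduces a declaration to that canonical string. `canonicalSignature` is a hypothetical helper, it skips the hashing step itself, and it does not handle tuple types that contain commas.

```typescript
// Reduce an event declaration to the canonical string that is hashed into topic0.
// Accepts both Solidity order ("bytes32 indexed bar") and the subgraph.yaml
// order ("indexed address").
function canonicalSignature(declaration: string): string {
  const match = /^\s*(\w+)\s*\((.*)\)\s*$/.exec(declaration);
  if (match === null) {
    throw new Error(`cannot parse event declaration: ${declaration}`);
  }
  const name = match[1];
  const params = match[2];

  const types = params
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    // Drop the `indexed` keyword; the first remaining token is the type.
    .map((p) => p.split(/\s+/).filter((token) => token !== "indexed")[0]);

  return `${name}(${types.join(",")})`;
}

const first = canonicalSignature("Foo(bytes32 indexed bar, uint256 baz)");
const second = canonicalSignature("Foo(bytes32 bar, uint256 indexed baz)");
const collide = first === second; // both reduce to "Foo(bytes32,uint256)"
```

Because both declarations reduce to the same string, they hash to the same topic0, and the two logs differ only in which argument ends up in the topics versus the data.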

silasdavis avatar Nov 05 '21 17:11 silasdavis

I vaguely remember that the reason indexed was introduced was that when a type with a variable size, such as a string, is indexed, the topic is the hash of the string. Thus many years ago, when we first encountered this situation, we realized we had to add indexed so that, if needed, we could get all the right topics. This is straightforward.

Whether or not the subgraph checks for event collisions, I do not know. My assumption is no, but one of the graph-node devs would know better. I personally never came across this scenario.

davekay100 avatar Nov 06 '21 01:11 davekay100

Hi! This (filtering on more topics) definitely makes sense. Graph Node is currently being upgraded to ingest via a Firehose, after which we will have a lot more flexibility to add more such filters.

azf20 avatar Nov 11 '21 10:11 azf20

And there is indeed no way to disambiguate between multiple sigs with the same selector.

azf20 avatar Nov 11 '21 10:11 azf20

Bumping this. Filtering on topics would be very helpful for events that are emitted in large quantities and whose handlers return early 99% of the time because one of the topics is not equal to a given value (for example, a swap here).

This is killing indexing time of our 0x community subgraphs: https://github.com/papercliplabs/zero-ex-subgraph/blob/main/src/mappings/uniswapV3.ts#L21

spennyp avatar Feb 02 '24 19:02 spennyp