graph-node icon indicating copy to clipboard operation
graph-node copied to clipboard

GraphQL: Add filter that only matches entities changed since a given block

Open lutter opened this issue 5 years ago • 14 comments

To support use cases where it is necessary to find out what has changed since a given block, we should offer a changed_since_block: $block filter that only matches entities that have been added or updated since $block. For completeness, it would also be good if that somehow included entities that have been deleted since $block, but it's not clear how 'this was deleted' could be indicated in the GraphQL response.

lutter avatar Aug 06 '20 15:08 lutter

Closing as a subset of https://github.com/graphprotocol/graph-node/issues/2958 (though maybe I should have elaborated this issue 🤔 )

azf20 avatar Nov 08 '21 22:11 azf20

Re-opening this, as we may actually want to tackle this more constrained use-case first

azf20 avatar Nov 27 '21 15:11 azf20

Proposed implementation:

  • Extend the where top-level filter to include a _change_block_gte field (leading underscore in order to avoid collisions)

The following would return all things as of block 1,234, which changed since block 1,000

things(block: { number: 1,234 }, where: { _change_block_gte: { number: 1,000 }) {
  id
}

This would not include deleted entities.

azf20 avatar Nov 27 '21 15:11 azf20

@lutter I am interested in if you envisaged this functionality going here, or elsewhere (as a new top-level field?) cc @dotansimha @kamilkisiela

azf20 avatar Nov 27 '21 15:11 azf20

@azf20 I started working on a PR for this. Started from the GraphQL area of the codebase. I noticed that input Block_height is already exists, I guess I can use it as where: { _block: Block_height }? This way user can query with where: { _block: { number: "XYZ" }} or where: { _block: { number_gte: "XYZ" }}?

Or should I use only _change_block_gte option under where?

dotansimha avatar Nov 29 '21 13:11 dotansimha

@dotansimha that is a very good point - I hadn't thought about the fact that Block_height already has number_gte, which fits perfectly with what we want to do. I think i would call it _change_block rather than _block to make it clear what the entities are being filtered on (the block when they last changed). What do you think @lutter? You can probably take quite a lot from the existing block top-level filter (which is different, as it defines the state for the overall query, i.e. "query the data as of block 1000", but will use a lot of the same information that will be helpful here)

azf20 avatar Nov 29 '21 13:11 azf20

@dotansimha that is a very good point - I hadn't thought about the fact that Block_height already has number_gte, which fits perfectly with what we want to do. I think i would call it _change_block rather than _block to make it clear what the entities are being filtered on (the block when they last changed). What do you think @lutter? You can probably take quite a lot from the existing block top-level filter (which is different, as it defines the state for the overall query, i.e. "query the data as of block 1000", but will use a lot of the same information that will be helpful here)

I don't think we can reuse the Block_height input type here, since it adds other filter capabilities, too, like { number: NNN } and { hash: 0xdeadbeef } which doesn't really make sense in the where clause. I think we have to add just the _change_block_gte field as a special case.

lutter avatar Nov 29 '21 19:11 lutter

It is an increase in the scope, but I think number and hash do make sense as values for the where clause, i.e.

where: {_change_block: {number: 100} }

Would get all entities which changed in that one block (excluding deletions)?

azf20 avatar Nov 30 '21 00:11 azf20

I'll start with just where: { _chane_block: { number_gte: 100 }} as separate input, and we can always extend the input type there (or, switch to use a different one) before merging. So I'll continue with that, and then we can discuss about different block filter in parallel.

dotansimha avatar Nov 30 '21 14:11 dotansimha

@azf20 @lutter @Jannis I created a PR fix adding this new filter: https://github.com/graphprotocol/graph-node/pull/3014 , do you think https://github.com/graphprotocol/graph-node/issues/2958 is also needed? (I guess it is? Because you can now filter base on that data, but you can't tell what is the actual change_block of each record found?)

dotansimha avatar Dec 08 '21 07:12 dotansimha

thanks @dotansimha - as discussed this morning I think we can wait on #2958, based on feedback on #3014 we can establish whether having the change_block available for each record is useful

azf20 avatar Dec 09 '21 13:12 azf20

@dotansimha moving this to awaiting release as it was merged?

azf20 avatar Apr 20 '22 09:04 azf20

This can cause problems when there is a subgraph that can "delete" entities. For example: not returning entities with a "balance" of 0. A "complete query" (requesting latest block) won't return entities with balance of 0. But if we do 2 queries, one a year ago, and one today using _change_block, there could be an entity with some balance a year ago, but not today. It won't return the recent one with a balance of 0, and thus we won't know that we should delete it.

I think this should be explained in the documentation, as it can be unexpected behaviour for people expecting something like subscriptions.

daviddavo avatar May 05 '22 15:05 daviddavo

Hi @daviddavo I think that is a good point, that would be unexpected - @dotansimha @lutter interested in what your take is here?

azf20 avatar May 27 '22 22:05 azf20