graph-node
GraphQL: Add filter that only matches entities changed since a given block
To support use cases where it is necessary to find out what has changed since a given block, we should offer a changed_since_block: $block filter that only matches entities that have been added or updated since $block. For completeness, it would also be good if that somehow included entities that have been deleted since $block, but it's not clear how 'this was deleted' could be indicated in the GraphQL response.
Closing as a subset of https://github.com/graphprotocol/graph-node/issues/2958 (though maybe I should have elaborated this issue 🤔 )
Re-opening this, as we may actually want to tackle this more constrained use-case first
Proposed implementation:
- Extend the where top-level filter to include a _change_block_gte field (leading underscore in order to avoid collisions)
The following would return all things as of block 1,234 that have changed since block 1,000:
things(block: { number: 1234 }, where: { _change_block_gte: { number: 1000 } }) {
  id
}
This would not include deleted entities.
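To make the proposed semantics concrete, here is a minimal Python sketch of how such a filter could behave. The Version record, its from_block/to_block fields, and the things() helper are hypothetical names chosen for illustration; this models the behaviour described above and is not graph-node's actual storage or query code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    """One stored version of an entity (hypothetical model)."""
    entity_id: str
    from_block: int            # block at which this version was written
    to_block: Optional[int]    # block at which it was replaced or deleted; None if current


def things(versions: List[Version], as_of_block: int,
           change_block_gte: Optional[int] = None) -> List[str]:
    """Return ids of entities visible as of `as_of_block`, optionally keeping
    only those whose visible version was written at or after `change_block_gte`."""
    result = []
    for v in versions:
        visible = v.from_block <= as_of_block and (
            v.to_block is None or v.to_block > as_of_block
        )
        if not visible:
            continue
        if change_block_gte is not None and v.from_block < change_block_gte:
            continue
        result.append(v.entity_id)
    return sorted(result)


versions = [
    Version("a", 900, None),    # unchanged since block 900
    Version("b", 950, 1100),    # old version, replaced at block 1100
    Version("b", 1100, None),   # current version of "b"
    Version("c", 980, 1200),    # deleted at block 1200
    Version("d", 1300, None),   # created after block 1234
]

print(things(versions, as_of_block=1234))                         # ['a', 'b']
print(things(versions, as_of_block=1234, change_block_gte=1000))  # ['b']
```

In this model the deleted entity "c" has no visible version at block 1234, so the filter has nothing to match, which is why deletions do not show up in the result.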
@lutter I am interested in if you envisaged this functionality going here, or elsewhere (as a new top-level field?) cc @dotansimha @kamilkisiela
@azf20 I started working on a PR for this, starting from the GraphQL area of the codebase. I noticed that the input type Block_height already exists, so I guess I can use it as where: { _block: Block_height }?
This way user can query with where: { _block: { number: "XYZ" }} or where: { _block: { number_gte: "XYZ" }}?
Or should I use only _change_block_gte option under where?
@dotansimha that is a very good point - I hadn't thought about the fact that Block_height already has number_gte, which fits perfectly with what we want to do. I think I would call it _change_block rather than _block to make it clear what the entities are being filtered on (the block when they last changed). What do you think @lutter?
You can probably take quite a lot from the existing block top-level filter (which is different, as it defines the state for the overall query, i.e. "query the data as of block 1000", but will use a lot of the same information that will be helpful here)
I don't think we can reuse the Block_height input type here, since it adds other filter capabilities, too, like { number: NNN } and { hash: 0xdeadbeef }, which don't really make sense in the where clause. I think we have to add just the _change_block_gte field as a special case.
It is an increase in the scope, but I think number and hash do make sense as values for the where clause, i.e.
where: {_change_block: {number: 100} }
Would get all entities which changed in that one block (excluding deletions)?
I'll start with just where: { _change_block: { number_gte: 100 }} as a separate input, and we can always extend the input type there (or switch to a different one) before merging.
So I'll continue with that, and we can discuss the other block filters in parallel.
@azf20 @lutter @Jannis I created a PR adding this new filter: https://github.com/graphprotocol/graph-node/pull/3014 , do you think https://github.com/graphprotocol/graph-node/issues/2958 is also needed? (I guess it is? Because you can now filter based on that data, but you can't tell what the actual change_block of each record found is?)
thanks @dotansimha - as discussed this morning I think we can wait on #2958, based on feedback on #3014 we can establish whether having the change_block available for each record is useful
@dotansimha moving this to awaiting release as it was merged?
This can cause problems when a subgraph can "delete" entities, for example by not returning entities with a "balance" of 0. A "complete query" (requesting the latest block) won't return entities with a balance of 0. But if we do two queries, one a year ago and one today using _change_block, there could be an entity that had some balance a year ago but has none today. The second query won't return that entity with its balance of 0, and thus we won't know that we should delete it.
I think this should be explained in the documentation, as it can be unexpected behaviour for people expecting something like subscriptions.
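The scenario above can be reproduced with a small Python simulation. Everything here is an assumption for illustration: the query() helper, the (balance, last_changed_block) tuples, and the rule that zero balances are hidden stand in for whatever the real subgraph and queries do.

```python
from typing import Dict, Optional, Tuple

# Hypothetical state: entity id -> (balance, block at which it last changed)
State = Dict[str, Tuple[int, int]]


def query(state: State, change_block_gte: Optional[int] = None) -> Dict[str, int]:
    """Return id -> balance, hiding zero balances (the "deleted" entities) and
    optionally keeping only entities changed at or after `change_block_gte`."""
    return {
        entity_id: balance
        for entity_id, (balance, changed) in state.items()
        if balance != 0 and (change_block_gte is None or changed >= change_block_gte)
    }


state_a_year_ago: State = {"alice": (50, 100), "bob": (20, 120)}
state_today: State = {"alice": (0, 900), "bob": (25, 950)}  # alice dropped to 0

# Client keeps a local cache: one full query a year ago, one incremental query today.
cache = query(state_a_year_ago)
cache.update(query(state_today, change_block_gte=500))

full = query(state_today)

print(cache)  # {'alice': 50, 'bob': 25}  <- alice is stale
print(full)   # {'bob': 25}
```

The incremental result never mentions alice, so a client that merges results this way keeps her old balance. Detecting that would need either a periodic full query or some explicit signal for removed entities, which this filter does not provide.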
Hi @daviddavo I think that is a good point, that would be unexpected - @dotansimha @lutter interested in what your take is here?