graph-node [Bug] Impossible combination of entity operations

Bug report

A subgraph that was running without issues on v0.34.0 suddenly started failing in v0.35.0. This subgraph is deployed across many networks, and all of the deployments started failing with this error, which suggests a regression in v0.35.0.

Relevant log output

May 28 13:26:21.945 ERRO Subgraph failed with non-deterministic error: Failed to transact block operations: internal constraint violated: impossible combination of entity operations: Remove { key: EntityKey(SplitRecipient[0x7c29ca34b44d388ab031ecce7781f2420e1e5c99-0xfa9aad02ffede509520e27ef329ee28871a76828-5], cr=0), block: 15264023 } and then Remove { key: EntityKey(SplitRecipient[0x7c29ca34b44d388ab031ecce7781f2420e1e5c99-0xfa9aad02ffede509520e27ef329ee28871a76828-5], cr=0), block: 15267303 }, retry_delay_s: 108, attempt: 0, sgd: 1, subgraph_id: QmcpChELh7eJShPHvG5zLBUYBsBQby9KZ8roh7BrT2Yp5B, component: SubgraphInstanceManager

IPFS hash

QmcpChELh7eJShPHvG5zLBUYBsBQby9KZ8roh7BrT2Yp5B

Subgraph name or link to explorer

No response

Some information to help us out

[X] Tick this box if this bug is caused by a regression found in the latest release.
[ ] Tick this box if this bug is specific to the hosted service.
[X] I have searched the issue tracker to make sure this issue is not a duplicate.

OS information

Linux

May 29 '24 13:05 paymog

We also see a second error on these subgraphs, though it happens less reliably than the one above: Failed to transact block operations: internal constraint violated: Batches must go forward. Can't append a batch with block pointer #114200817

May 29 '24 13:05 paymog

This issue might be related to batching. I tried to bisect, but the issue doesn't reproduce reliably on any commit. I thought I had bisected it down to 31943fc706c84e8afe4a3677b7cf172339d72461, but when I tested the previous commit I found no issues, and after switching back to 31943fc706c84e8afe4a3677b7cf172339d72461 the issue no longer happens. Very unusual. It also doesn't make sense that this would be the offending commit.

May 29 '24 14:05 paymog

Now I'm thinking this might be a subgraph bug that wasn't revealed until we upgraded to v0.35.0

May 29 '24 15:05 paymog

Setting GRAPH_STORE_WRITE_BATCH_SIZE=0 seems to resolve the issue

May 29 '24 15:05 paymog

The only batching-related commits I see between 0.34.0 and 0.35.0 enable or disable batching based on whether the subgraph is caught up. In my local testing the subgraph is still catching up, so batching is definitely enabled. Did any other batching changes happen between these two releases? cc @leoyvens @lutter

Alternatively, could there be some changes to the logic that affect loading entities? The subgraph in question has a flow like:

1. parse the list of addresses in the event
2. load the relevant entity
3. for any addresses that existed in the entity before but do not exist in the current event, use store.remove to remove them
4. save the entity with the latest list of addresses

Since we see two remove modifications here, could something be going wrong in step 4 (not properly saving before committing) or step 2 (not properly loading during a batch)?
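To make the flow above concrete, here is a minimal sketch in plain TypeScript. It uses an in-memory Map as a stand-in for the store, not the graph-ts API, and every name in it (Split, removeRecipient, handleUpdate) is hypothetical. It only illustrates why a second remove for the same key would be expected if step 2 loaded a stale list.

```typescript
// Toy in-memory stand-in for the subgraph store; NOT the graph-ts API.
type Split = { id: string; recipients: string[] };

const splits = new Map<string, Split>();
const recipients = new Map<string, { id: string }>();
const removeLog: string[] = []; // every remove call, in order

// Hypothetical helper playing the role of store.remove.
function removeRecipient(id: string): void {
  removeLog.push(id);
  recipients.delete(id);
}

function handleUpdate(splitId: string, addresses: string[]): void {
  // Step 1: the parsed address list arrives as `addresses`.
  // Step 2: load the relevant entity (or start from an empty one).
  const split = splits.get(splitId) ?? { id: splitId, recipients: [] };

  // Step 3: remove recipients that existed before but are absent now.
  const current = new Set(addresses);
  for (const addr of split.recipients) {
    if (!current.has(addr)) removeRecipient(`${splitId}-${addr}`);
  }

  // Step 4: save the entity with the latest list of addresses.
  for (const addr of addresses) {
    const id = `${splitId}-${addr}`;
    recipients.set(id, { id });
  }
  splits.set(splitId, { id: splitId, recipients: [...addresses] });
}

handleUpdate("s", ["a", "b"]); // creates s-a and s-b
handleUpdate("s", ["a"]);      // removes s-b once
handleUpdate("s", ["a"]);      // removes nothing, because step 2 sees the saved list
```

In this model the remove for s-b is issued exactly once. If the save in step 4 were lost, or step 2 returned the older list, the third call would issue the same remove again, which matches the shape of the error in the log.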

May 29 '24 16:05 paymog

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.

Dec 07 '24 00:12 github-actions[bot]

I seem to have run into this same issue. With subgraph IPFS hash QmfA2FrsjAz5EEK5NuyqDGhE33Umnhbzbc1YEeVVDb6TgL, I get:

Error: Failed to transact block operations: internal constraint violated: impossible combination of entity operations: Remove { key: EntityKey(PackShareContent[0x575700002a090000], cr=0), block: 67096196 } and then Remove { key: EntityKey(PackShareContent[0x575700002a090000], cr=0), block: 67100212 }

What the subgraph does is similar to the OP's: For handlers of certain events, it will iterate through a list of entities (that are linked to another entity via 1:n relation and a derived field), removing them from the store. In the first of the two blocks in question (from the error), an event is fired that will trigger a "full refresh" that takes some time to process. In the second of the two blocks, an event is fired that refreshes just a select few entities. The error sounds like the indexer is trying to process them in parallel, and thus a duplicate remove is attempted. Or the second event's handler somehow does not correctly see that the entities have already been removed previously / an update is in progress for them.

Jan 26 '25 13:01 domob1812

@domob1812 Sorry for the long radio silence, just coming back to this: can you confirm that in your case the subgraph removes entities without first checking if they exist? That should be ok, I just want to make sure that that's the case here. I can then change graph-node to allow that.
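As an illustration of what "allow that" could mean, the following is a toy model in TypeScript. It is not graph-node's code (graph-node is written in Rust, and its batching implementation is not shown in this thread). The Batch class, its allowRepeatedRemove flag, and the error text it builds are all assumptions made for this sketch.

```typescript
// Toy model of a write batch that tracks the last operation per entity key.
type Op = { kind: "write" | "remove"; key: string; block: number };

class Batch {
  private last = new Map<string, Op>();
  private allowRepeatedRemove: boolean;

  constructor(allowRepeatedRemove: boolean) {
    this.allowRepeatedRemove = allowRepeatedRemove;
  }

  append(op: Op): void {
    const prev = this.last.get(op.key);
    if (prev !== undefined && prev.kind === "remove" && op.kind === "remove") {
      if (!this.allowRepeatedRemove) {
        // Strict mode: removing an already-removed entity is rejected.
        throw new Error(
          `impossible combination of entity operations: ` +
            `remove at block ${prev.block} and then remove at block ${op.block}`
        );
      }
      return; // Lenient mode: the entity is already gone, keep the earlier remove.
    }
    this.last.set(op.key, op);
  }

  lastBlock(key: string): number | undefined {
    return this.last.get(key)?.block;
  }
}

const key = "PackShareContent[0x575700002a090000]";

const strict = new Batch(false);
strict.append({ kind: "remove", key, block: 67096196 });
let strictError = "";
try {
  strict.append({ kind: "remove", key, block: 67100212 });
} catch (e) {
  strictError = (e as Error).message;
}

const lenient = new Batch(true);
lenient.append({ kind: "remove", key, block: 67096196 });
lenient.append({ kind: "remove", key, block: 67100212 }); // treated as a no-op
```

The strict batch fails on the second remove, much like the reported error, while the lenient batch treats it as a no-op and keeps the earlier removal block.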

Apr 07 '25 21:04 lutter

@lutter No worries, I have updated my subgraph to not use immutable entities just in case and it is working fine.

The subgraph uses store.remove to remove entities which have been enumerated by a derived entity (using store.loadRelated in the code generated from the schema).

Apr 10 '25 13:04 domob1812

I just opened a PR that will fix this problem. Once that's out and deployed, you should be able to go back to using immutable entities, as that will be much faster for queries.

Apr 10 '25 16:04 lutter
