graph-node [Bug] Rewinding a subgraph causes a constraint violation in graph-node that in turn causes indexer-agent to crashloop

Bug report

graph-node:v0.34.1 indexer-agent:v0.20.22

Activities that were undertaken before observing this bug:

Cleared the call_cache for Arbitrum via psql, as part of troubleshooting sync performance on a complex subgraph.
Rewound a specific problematic subgraph, Silo Finance Arbitrum (QmTMKqty5yZvZtB3SwzXUG92aZUH1YQw3VjByGw4wgaMhW), to block 1 using graphman.
Observed the subgraph sync to roughly block 130m and then stall.
Checked the graph-node logs and found a related error (see log output).
Observed indexer-agent reporting the same issue and crash looping. We cannot use the agent at all right now to manage subgraphs (see log output).

IMPACT: Production Indexer at risk. We cannot manage our online and offline allocations while this issue persists, so ideally we need a temporary fix for these specific symptoms. Would graphman drop resolve the issue? Would graph-node and indexer-agent be able to handle that and start syncing the subgraph again from scratch, given that this subgraph is in flight with live allocations?

Relevant log output

----- GRAPH-NODE
Apr 04 13:24:34.037 ERRO Subgraph instance failed to run: internal constraint violated: Subgraph writer for QmTMKqty5yZvZtB3SwzXUG92aZUH1YQw3VjByGw4wgaMhW[sgd622] is not running, sgd: 622, subgraph_id: QmTMKqty5yZvZtB3SwzXUG92aZUH1YQw3VjByGw4wgaMhW, component: SubgraphInstanceManager
Apr 04 13:48:18.741 WARN Price provider Removed: 0x8dca64a43865454f41aa1a3cf0140eb89f2c08aa53871235ecbe46b6a309a1e3, data_source: PriceProvidersRepository, sgd: 622, subgraph_id: QmTMKqty5yZvZtB3SwzXUG92aZUH1YQw3VjByGw4wgaMhW, component: SubgraphInstanceManager > UserMapping
Apr 04 13:48:18.742 ERRO Oracle was not found when trying to remove it at txn: 0x8dca64a43865454f41aa1a3cf0140eb89f2c08aa53871235ecbe46b6a309a1e3, data_source: PriceProvidersRepository, sgd: 622, subgraph_id: QmTMKqty5yZvZtB3SwzXUG92aZUH1YQw3VjByGw4wgaMhW, component: SubgraphInstanceManager > UserMapping

----- INDEXER-AGENT
{"level":50,"time":1712241706593,"pid":1,"hostname":"268ad9e1400b","name":"IndexerAgent","component":"GraphNode","err":{"type":"IndexerError","message":"Failed to query indexing status API","stack":"IndexerError: Failed to query indexing status API\n    at indexerError (/opt/indexer/packages/indexer-common/dist/errors.js:173:12)\n    at GraphNode.<anonymous> (/opt/indexer/packages/indexer-common/dist/graph-node.js:146:55)\n    at Generator.next (<anonymous>)\n    at fulfilled (/opt/indexer/packages/indexer-common/dist/graph-node.js:5:58)\n    at processTicksAndRejections (node:internal/process/task_queues:96:5)","code":"IE018","explanation":"https://github.com/graphprotocol/indexer/blob/main/docs/errors.md#ie018","cause":{"type":"CombinedError","message":"[GraphQL] Store error: internal constraint violated: the entityCount for QmTMKqty5yZvZtB3SwzXUG92aZUH1YQw3VjByGw4wgaMhW is not representable as a u64","name":"CombinedError","graphQLErrors":[{"message":"Store error: internal constraint violated: the entityCount for QmTMKqty5yZvZtB3SwzXUG92aZUH1YQw3VjByGw4wgaMhW is not representable as a u64"}],"response":{"size":0,"timeout":0}}},"msg":"Failed to query indexing status API"}

IPFS hash

No response

Subgraph name or link to explorer

https://thegraph.com/explorer/subgraphs/2ufoztRpybsgogPVW6j9NTn1JmBWFYPKbP7pAabizADU?view=Overview&chain=arbitrum-one

Some information to help us out

[ ] Tick this box if this bug is caused by a regression found in the latest release.
[ ] Tick this box if this bug is specific to the hosted service.
[X] I have searched the issue tracker to make sure this issue is not a duplicate.

OS information

Linux

Apr 04 '24 15:04 cryptovestor21

the entityCount for QmTMKqty5yZvZtB3SwzXUG92aZUH1YQw3VjByGw4wgaMhW is not representable as a u64

Maybe the rewind somehow turned the entity count negative, which would of course be a bug.
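To illustrate why a negative count would surface as this error, here is a minimal Rust sketch. It assumes the stored count is a signed integer that gets converted to `u64` when read. The helper name `entity_count_as_u64` is hypothetical and this is not graph-node's actual code; it only shows that a negative value cannot be converted.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not graph-node's real implementation):
/// convert a signed count read from the store into a u64.
/// Any negative value fails the conversion and is reported as an error.
fn entity_count_as_u64(deployment: &str, raw: i64) -> Result<u64, String> {
    u64::try_from(raw).map_err(|_| {
        format!(
            "the entityCount for {} is not representable as a u64",
            deployment
        )
    })
}
```

Under these assumptions, `entity_count_as_u64("QmExample", 42)` succeeds, while `entity_count_as_u64("QmExample", -1)` returns an error with the same wording as the message in the log.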

Apr 04 '24 15:04 leoyvens

@leoyvens I think the problem came from rewinding to block 1 when the subgraph's startBlock was actually 51880000. That would mean graph-node doesn't handle that scenario, and that is what created all this chaos.
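One way to picture the missing check is a guard that rejects a rewind target below the subgraph's start block. This is only a sketch of the idea, assuming both values are known before the rewind runs. The function `check_rewind_target` is hypothetical and does not exist in graph-node or graphman.

```rust
/// Hypothetical guard (not part of graph-node or graphman):
/// refuse to rewind to a block that precedes the subgraph's start block.
fn check_rewind_target(target: u64, start_block: u64) -> Result<u64, String> {
    if target < start_block {
        Err(format!(
            "rewind target {} is below the subgraph start block {}",
            target, start_block
        ))
    } else {
        Ok(target)
    }
}
```

With the numbers from this report, `check_rewind_target(1, 51_880_000)` would return an error instead of letting the rewind proceed.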

Apr 05 '24 11:04 trader-payne

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.

Oct 03 '24 00:10 github-actions[bot]
