
[Bug] Flaky DeploymentNotFound Error when deploying subgraphs

Open kevin-satsuma opened this issue 2 years ago • 5 comments

Bug report

Running a self-hosted graph-node on v0.31.0.

Noticed a flaky situation where the subgraph deployment sometimes fails with the DeploymentNotFound error. Even though graph-cli reports an error, the subgraph seems to start indexing properly.

Subsequent deploys of the same IPFS hash will result in a different error: duplicate key value violates unique constraint "subgraph_deployment_id_key".

Tracing through the code, the error most likely comes from create_deployment_internal.

set_on_sync called by create_deployment is one potential spot, but checking the subgraph_manifest table shows no missing rows.

create_subgraph_version is another possible spot, but the subgraph_deployment table looks fine as well.

Perhaps these tables were not populated correctly at time of the error due to a race condition or database issue? Looking at metrics, the database was not under a lot of load at the time.
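
To make the hypothesis concrete, here is a toy model of how a two-step write could produce both symptoms. It is only a sketch under stated assumptions: it uses an in-memory SQLite database rather than Postgres, the `deploy` and `open_store` helpers and the `lookup_fails` flag are hypothetical, and nothing here is taken from graph-node's actual code. It assumes the deployment row is committed before a later lookup fails.

```python
import sqlite3


class DeploymentNotFound(Exception):
    """Stand-in for the error variant seen in the graph-node log."""


def open_store():
    # Autocommit mode: every statement commits immediately, so a later
    # failure cannot roll back an earlier insert.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE subgraph_deployment (id TEXT UNIQUE)")
    return conn


def deploy(conn, ipfs_hash, lookup_fails=False):
    # Step 1: write the deployment row (committed right away).
    conn.execute("INSERT INTO subgraph_deployment (id) VALUES (?)", (ipfs_hash,))
    # Step 2: hypothetical follow-up lookup that fails transiently.
    if lookup_fails:
        raise DeploymentNotFound(ipfs_hash)
    return ipfs_hash


conn = open_store()
ipfs_hash = "QmYSk536LvM4fqqqCVDzjCqsNochPcnwPGXFkWbUCHzLBP"

# First deploy: the row is written, then the simulated lookup fails.
try:
    deploy(conn, ipfs_hash, lookup_fails=True)
    first_error = None
except DeploymentNotFound:
    first_error = "DeploymentNotFound"

# Second deploy of the same hash: the leftover row trips the unique constraint.
try:
    deploy(conn, ipfs_hash)
    second_error = None
except sqlite3.IntegrityError as exc:
    second_error = str(exc)

rows = conn.execute("SELECT COUNT(*) FROM subgraph_deployment").fetchone()[0]
print(first_error, "|", second_error, "| rows:", rows)
```

In this model the first call reports an error while leaving a valid row behind, which would be consistent with the subgraph indexing anyway and with the later unique-constraint failure. Whether graph-node actually commits in this order is exactly the open question.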

Would appreciate a second pair of eyes on this, as it is a flaky error that I have not yet managed to reproduce while debugging.

Relevant log output

# Log from graph-node on first deployment
Jul 11 16:44:26.941 ERRO subgraph_deploy failed, params: SubgraphDeployParams { name: SubgraphName("c47020ff-bcc6-44f4-8296-9b0ec6cb783b"), ipfs_hash: DeploymentHash("QmYSk536LvM4fqqqCVDzjCqsNochPcnwPGXFkWbUCHzLBP"), node_id: None, debug_fork: None, history_blocks: None }, error: SubgraphDeploymentError(DeploymentNotFound("QmYSk536LvM4fqqqCVDzjCqsNochPcnwPGXFkWbUCHzLBP")), component: JsonRpcServer

# Log from graph-node on subsequent deployment
Jul 12 10:14:25.099 ERRO subgraph_deploy failed, params: SubgraphDeployParams { name: SubgraphName("d94932a4-daca-4443-84dd-fddc1169cab7"), ipfs_hash: DeploymentHash("QmYSk536LvM4fqqqCVDzjCqsNochPcnwPGXFkWbUCHzLBP"), node_id: None, debug_fork: None, history_blocks: None }, error: SubgraphDeploymentError(Unknown(duplicate key value violates unique constraint "subgraph_deployment_id_key")), component: JsonRpcServer
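
For anyone scanning their own logs for the same pattern, a small sketch like the following can pull the subgraph name, IPFS hash, and error variant out of `subgraph_deploy failed` lines and group the failures by hash. The regular expression is an assumption based only on the two lines above, and `parse_deploy_failure` is a hypothetical helper, not part of graph-node or graph-cli.

```python
import re
from collections import defaultdict

# Pattern inferred from the two log lines in this report; other graph-node
# versions may format these lines differently.
LOG_RE = re.compile(
    r'SubgraphName\("(?P<name>[^"]+)"\).*?'
    r'DeploymentHash\("(?P<ipfs_hash>[^"]+)"\).*?'
    r'error: SubgraphDeploymentError\((?P<variant>\w+)\('
)


def parse_deploy_failure(line):
    """Return name, ipfs_hash and error variant, or None if the line does not match."""
    match = LOG_RE.search(line)
    return match.groupdict() if match else None


lines = [
    'Jul 11 16:44:26.941 ERRO subgraph_deploy failed, params: SubgraphDeployParams { name: SubgraphName("c47020ff-bcc6-44f4-8296-9b0ec6cb783b"), ipfs_hash: DeploymentHash("QmYSk536LvM4fqqqCVDzjCqsNochPcnwPGXFkWbUCHzLBP"), node_id: None, debug_fork: None, history_blocks: None }, error: SubgraphDeploymentError(DeploymentNotFound("QmYSk536LvM4fqqqCVDzjCqsNochPcnwPGXFkWbUCHzLBP")), component: JsonRpcServer',
    'Jul 12 10:14:25.099 ERRO subgraph_deploy failed, params: SubgraphDeployParams { name: SubgraphName("d94932a4-daca-4443-84dd-fddc1169cab7"), ipfs_hash: DeploymentHash("QmYSk536LvM4fqqqCVDzjCqsNochPcnwPGXFkWbUCHzLBP"), node_id: None, debug_fork: None, history_blocks: None }, error: SubgraphDeploymentError(Unknown(duplicate key value violates unique constraint "subgraph_deployment_id_key")), component: JsonRpcServer',
]

# Group error variants by IPFS hash to spot a DeploymentNotFound followed
# by a unique-constraint failure on the same deployment.
variants_by_hash = defaultdict(list)
for line in lines:
    parsed = parse_deploy_failure(line)
    if parsed:
        variants_by_hash[parsed["ipfs_hash"]].append(parsed["variant"])

for ipfs_hash, variants in variants_by_hash.items():
    print(ipfs_hash, variants)
```

A hash whose list starts with `DeploymentNotFound` and continues with `Unknown` matches the sequence described in this report.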

IPFS hash

No response

Subgraph name or link to explorer

No response

Some information to help us out

  • [ ] Tick this box if this bug is caused by a regression found in the latest release.
  • [ ] Tick this box if this bug is specific to the hosted service.
  • [X] I have searched the issue tracker to make sure this issue is not a duplicate.

OS information

Linux

kevin-satsuma, Jul 13 '23 01:07

Hey @kevin-satsuma, can you elaborate on when this DeploymentNotFound error occurs? Is it random?

incrypto32, Jul 24 '23 06:07

Sure! The DeploymentNotFound error seems to be random so far. When the error occurs, resource usage on the indexer and database both look normal as well. This has made it difficult to investigate without the ability to reproduce the error on demand.

kevin-satsuma, Jul 27 '23 15:07

hey @kevin-satsuma is this running in combined mode, with a single database (i.e. not using Graph Node sharding)?

azf20, Aug 09 '23 12:08

@azf20 The graph-node instance showing this error is running with default node_role of combined-node, but we do not send any queries to it. It uses a single database and does not use Graph Node sharding.

kevin-satsuma, Aug 09 '23 15:08

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.

github-actions[bot], Feb 06 '24 00:02