
[Bug] deterministic substreams errors are not reported at the graph-node level

Open madumas opened this issue 2 years ago • 6 comments

Bug report

A substreams-powered subgraph that fails with a deterministic error is not properly handled by graph-node. It should detect the error and report the subgraph as "failed" in a deterministic way. Instead, graph-node reports the subgraph as healthy and continually retries the substream.

graph-node: v0.32.0
firehose-ethereum: v1.4.10

Relevant log output

Nov 07 14:12:41.040 INFO Blockstreams connected, provider: substreams, deployment: QmT2nEwS9YWFzLE39Ujuv2oNVhgbctc8VUtB28RXb4wUbY, sgd: 351, subgraph_id: QmT2nEwS9YWFzLE39Ujuv2oNVhgbctc8VUtB28RXb4wUbY, component: SubstreamsBlockStream
Nov 07 14:12:41.043 INFO received err, provider: substreams, deployment: QmT2nEwS9YWFzLE39Ujuv2oNVhgbctc8VUtB28RXb4wUbY, sgd: 351, subgraph_id: QmT2nEwS9YWFzLE39Ujuv2oNVhgbctc8VUtB28RXb4wUbY, component: SubstreamsBlockStream
Nov 07 14:12:41.043 ERRO An error occurred while streaming blocks: status: Unknown, message: "rpc error: code = InvalidArgument desc = step new irr: handler step new: execute modules: applying executor results \"graph_out\": execute: maps wasm call: block 9974225: module \"graph_out\": wasm execution failed deterministically: panic in the wasm: \"called `Option::unwrap()` on a `None` value\" at /Users/mack/.cargo/registry/src/github.com-1ecc6299db9ec823/substreams-0.5.10/src/scalar.rs:404:25

IPFS hash

QmT2nEwS9YWFzLE39Ujuv2oNVhgbctc8VUtB28RXb4wUbY

Subgraph name or link to explorer

No response

Some information to help us out

  • [ ] Tick this box if this bug is caused by a regression found in the latest release.
  • [ ] Tick this box if this bug is specific to the hosted service.
  • [X] I have searched the issue tracker to make sure this issue is not a duplicate.

OS information

None

madumas avatar Nov 07 '23 14:11 madumas

thanks @madumas! @mangas checking if this is the currently expected behaviour, or if we have some error handling in place for substreams?

azf20 avatar Dec 01 '23 21:12 azf20

We currently don't have this information surfaced through the gRPC types. The current Error definition looks like this:

message Error {
  string module = 1;
  string reason = 2;
  repeated string logs = 3;
  // FailureLogsTruncated is a flag that tells you if you received all the logs or if they
  // were truncated because you logged too much (fixed limit currently is set to 128 KiB).
  bool logs_truncated = 4;
}

The correct way to handle this would be for a deterministic flag (or similar) to be added so that the graph-node can treat it as such.

mangas avatar Dec 04 '23 12:12 mangas

@mangas we don't need a flag: the condition is in the error code: code = InvalidArgument

InvalidArgument is returned from Substreams when the error is "not due to the state of the system".

I'm not certain why you are getting "status: Unknown", however... I will investigate that.

sduchesneau avatar Dec 05 '23 16:12 sduchesneau

The gRPC issue has been fixed in substreams and pulled into firehose-core: https://github.com/streamingfast/firehose-core/releases/tag/v0.2.4

Status field will show as InvalidArgument instead of Unknown.

Now, can graph-node be configured to treat the "InvalidArgument" status code as a deterministic error?
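
For illustration, here is a minimal sketch of what such a classification could look like on the consumer side. It is not graph-node's actual code: the `Code` enum is a local stand-in for a gRPC status code type, and `StreamError` and `classify` are hypothetical names invented for this example. It also assumes InvalidArgument is the only code that signals a deterministic failure, which has not been confirmed in this thread.

```rust
/// Local stand-in for a gRPC status code (assumption: a real client
/// library would supply its own type; only a few codes are listed here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[allow(dead_code)]
enum Code {
    Unknown,
    InvalidArgument,
    Unavailable,
    Internal,
}

/// Hypothetical result of classifying a stream error.
#[derive(Debug, PartialEq, Eq)]
enum StreamError {
    /// Retrying will reproduce the same failure; the subgraph should be marked as failed.
    Deterministic(String),
    /// Possibly transient; the stream can be reconnected and retried.
    Retryable(String),
}

/// Classify a stream error by its status code.
/// Assumption: only InvalidArgument means "not due to the state of the system".
fn classify(code: Code, message: &str) -> StreamError {
    match code {
        Code::InvalidArgument => StreamError::Deterministic(message.to_string()),
        // Every other code, including Unknown, is treated as retryable.
        _ => StreamError::Retryable(message.to_string()),
    }
}
```

Under this sketch, an endpoint that still reports the status as Unknown (as in the log above) would keep being retried, so the classification only helps once the endpoint runs a release that includes the fix.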

@mangas

sduchesneau avatar Dec 05 '23 18:12 sduchesneau

@sduchesneau is that the only error code which is indicative of a deterministic error?

azf20 avatar Dec 09 '23 21:12 azf20

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.

github-actions[bot] avatar Jun 13 '24 00:06 github-actions[bot]

todo: Investigate status

alex-pakalniskis avatar Oct 21 '24 15:10 alex-pakalniskis