
graphman rewind not clearing failed status of deployment

Open paymog opened this issue 3 years ago • 2 comments

Do you want to request a feature or report a bug?

Bug, I think

What is the current behavior?

graphman rewind does not clear the synced or failed columns for a subgraph deployment

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem.

I had a subgraph deployment with a Postgres entry like this:

[screenshot of the subgraphs.subgraph_deployment row for the deployment]

Note that it has failed with a fatal error. It was stuck on block 15402220. I recently rewound it with graphman using a command like

```
graphman rewind 0xa3643c8c86108af9f4f041f8880f333dd0f35a0b30eecdefad909cd6e7489fae 15401220 --sleep 300 <deployment id>
```

That graphman command succeeded in rewinding the subgraph to block 15401220 (as seen in the above screenshot).

However, when I check the subgraph deployment row for the deployment, I see that it still has a status of synced=true and failed=true even though it's been rewound.
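For reference, a check along these lines can be run in psql (a sketch only; the table and column names match the `subgraphs.subgraph_deployment` record pasted later in this thread, and `<deployment id>` is the numeric `id` of the row, not the IPFS hash):

```sql
-- Inspect the status flags that rewind is expected to reset
SELECT failed, synced, latest_ethereum_block_number, health
FROM subgraphs.subgraph_deployment
WHERE id = <deployment id>;
```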

What is the expected behavior?

I'd expect that rewinding a subgraph would clear (or set to false) the synced and failed columns for the deployment specified by the subgraph.

paymog avatar Sep 22 '22 16:09 paymog

Thanks for the feedback, @paymog. Can you confirm if the synced and failed fields are set to the expected values once the subgraph resumes syncing?

tilacog avatar Sep 22 '22 17:09 tilacog

I didn't notice those fields update to the expected values after the subgraph resumed indexing. I ended up setting them manually using psql.
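A manual reset along these lines is what I mean (a sketch only; which fields to touch, and whether clearing `fatal_error` and `health` is safe for a given deployment, are assumptions on my part — the column names match the `subgraphs.subgraph_deployment` record pasted later in this thread):

```sql
-- Hypothetical manual cleanup after a rewind; <deployment id> is the
-- numeric id of the row in subgraphs.subgraph_deployment.
UPDATE subgraphs.subgraph_deployment
SET failed = false,
    fatal_error = null,
    health = 'healthy'
WHERE id = <deployment id>;
```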

paymog avatar Sep 22 '22 19:09 paymog

@tilacog have you been able to verify whether this is reproducible?

paymog avatar Oct 18 '22 13:10 paymog

Hi @paymog, just to check: are you saying this is the case immediately after the rewind, or also after the subgraph starts syncing again? (The expected current behaviour is that the "failed" status is cleared on restart, while the "synced" status is not cleared once a subgraph has been "synced" for the first time.)

azf20 avatar Oct 18 '22 15:10 azf20

Immediately after rewind

paymog avatar Oct 18 '22 20:10 paymog

I'm seeing similar behavior. After a rewind, the subgraph was able to sync back to chain head successfully and continue syncing healthily. However, the failed status has not cleared.

Here is the current record in subgraphs.subgraph_deployment (you'll see it does have a fatal_error):

```
failed                             | t
synced                             | t
earliest_ethereum_block_hash       | \x837489810c9e1b329e3984052d22d472de1a5afc6c03a8052f6ce7a9ece15017
earliest_ethereum_block_number     | 11439999
latest_ethereum_block_hash         | \x6d379780843e7d9ffbbe8b88076cc3752557ed6567963c7367447f3018c7acfd
latest_ethereum_block_number       | 15883215
entity_count                       | 481559
graft_base                         | 
graft_block_hash                   | 
graft_block_number                 | 
fatal_error                        | c49f42d756b75f497bd83d9b2bd48f3a610bebd9d4879c97df53cfbd4f6ff61b
non_fatal_errors                   | {}
health                             | failed
reorg_count                        | 13308
current_reorg_depth                | 0
max_reorg_depth                    | 3
last_healthy_ethereum_block_hash   | 
last_healthy_ethereum_block_number | 
id                                 | 170
firehose_cursor                    | 
debug_fork                         | 
earliest_block_number              | 11439999
```

fordN avatar Nov 02 '22 15:11 fordN

Thanks for fixing this @evaporei! Do you know when the fix will be released as part of an updated Docker image?

paymog avatar Dec 05 '22 12:12 paymog