cumulus icon indicating copy to clipboard operation
cumulus copied to clipboard

Node being stucked

Open rjonczy opened this issue 2 years ago • 7 comments

Sometimes, our parachain node gets stucks at some block.

Parachain, related blocks looks like this:

2023-08-23 12:35:15.944  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (21 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 0.6kiB/s ⬆ 1.0kiB/s  
2023-08-23 12:35:20.944  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (21 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 3.0kiB/s ⬆ 0.2kiB/s  
2023-08-23 12:35:25.945  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 0.3kiB/s ⬆ 0.3kiB/s  
2023-08-23 12:35:30.945  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 0.2kiB/s ⬆ 0.2kiB/s  
2023-08-23 12:35:35.946  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 174.2kiB/s ⬆ 0.2kiB/s
2023-08-23 12:35:40.946  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 16.4kiB/s ⬆ 1.0kiB/s 
2023-08-23 12:35:45.947  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 3.0kiB/s ⬆ 0.2kiB/s  
2023-08-23 12:35:50.947  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 0.2kiB/s ⬆ 0.2kiB/s  
2023-08-23 12:35:55.947  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 6.3kiB/s ⬆ 0.2kiB/s  
2023-08-23 12:36:00.948  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 0.4kiB/s ⬆ 0.7kiB/s  
2023-08-23 12:36:05.948  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 75.6kiB/s ⬆ 0.2kiB/s 
2023-08-23 12:36:10.949  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (22 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 121.3kiB/s ⬆ 0.2kiB/s
2023-08-23 12:36:15.949  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (21 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 0.4kiB/s ⬆ 0.8kiB/s  
2023-08-23 12:36:20.950  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Preparing  0.0 bps, target=#3090498 (21 peers), best: #3089160 (0xeac1…b76e), finalized #3084883 (0x372a…d05f), ⬇ 3.0kiB/s ⬆ 0.2kiB/s  

Can you help me to understand how to debug why it is stucked ? And how to prevent something like this in future?

rjonczy avatar Aug 23 '23 14:08 rjonczy

Possibly a duplicate of https://github.com/paritytech/polkadot-sdk/issues/13

cc @skunert

altonen avatar Aug 23 '23 15:08 altonen

Yes, could be the same issue.

Which node version are you using? What was the command to launch the node? Did you see this once or is this happening regularly?

skunert avatar Aug 23 '23 15:08 skunert

@skunert it is running:

/nix/store/yy51dmfs4h75ci73visv2siaa3gfb7mg-composable/bin/composable \
-lsubstrate_basic_authorship=trace,txpool=trace,sync::import-queue=trace,txpool::api=trace \
--chain=picasso \
--name=xxx \
--listen-addr=/ip4/0.0.0.0/tcp/40334 \
--prometheus-external \
--prometheus-port 19615 \
--base-path /data \
--execution=wasm \
--pruning=100800 \
--database=paritydb \
--sync fast \
--ws-external \
--rpc-external \
--unsafe-rpc-external \
--rpc-methods=unsafe \
--ws-port 8844 \
--unsafe-ws-external \
--rpc-cors=all \
--rpc-port 8833 \
--in-peers 1000 \
--out-peers 1000 \
--log=ibc_transfer=trace,pallet_ibc=trace \
--offchain-worker=always \
--enable-offchain-indexing=true \
--ws-max-connections 10000 \
--rpc-max-subscriptions-per-connection 10240 \
-- \
--execution=wasm \
--listen-addr=/ip4/0.0.0.0/tcp/40333 \
--ws-external \
--rpc-external \
--unsafe-rpc-external \
--rpc-methods=unsafe \
--ws-port 8845 \
--rpc-port 8834 \
--pruning=201600 \
--database=paritydb \
--sync fast \
--ws-max-connections 10000 \
--rpc-max-subscriptions-per-connection 10240 \
--unsafe-ws-external \
--name=xxx \
--rpc-cors=all

Node is still in this state. Didn't restart it yet.

rjonczy avatar Aug 23 '23 17:08 rjonczy

@skunert oka, so i did:

  • stopped node
  • wiped-out data
  • synched from scratch

and again it got stucked at some block (#3096689), however now we are in a state:

❯ head xxx.logs
2023-08-24 08:46:00.723  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.4kiB/s ⬆ 0.3kiB/s
2023-08-24 08:46:05.723  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.2kiB/s ⬆ 0.2kiB/s
2023-08-24 08:46:10.725  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 195.9kiB/s ⬆ 0.2kiB/s
2023-08-24 08:46:15.726  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 1.3kiB/s ⬆ 0.5kiB/s
2023-08-24 08:46:20.727  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 6.5kiB/s ⬆ 0.2kiB/s
2023-08-24 08:46:25.728  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.2kiB/s ⬆ 0.2kiB/s
2023-08-24 08:46:30.766  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.5kiB/s ⬆ 0.3kiB/s
2023-08-24 08:46:35.767  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (21 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 2.8kiB/s ⬆ 0.3kiB/s
2023-08-24 08:46:40.768  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (21 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.2kiB/s ⬆ 0.2kiB/s
2023-08-24 08:46:45.768  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (21 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 103.3kiB/s ⬆ 0.6kiB/s
❯ tail xxx.logs
2023-08-24 10:13:36.534  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (21 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.3kiB/s ⬆ 0.3kiB/s
2023-08-24 10:13:41.535  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 6.0kiB/s ⬆ 0.4kiB/s
2023-08-24 10:13:46.535  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 6.0kiB/s ⬆ 1.3kiB/s
2023-08-24 10:13:51.535  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 100.2kiB/s ⬆ 0.3kiB/s
2023-08-24 10:13:56.536  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.2kiB/s ⬆ 0.2kiB/s
2023-08-24 10:14:01.536  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 2.9kiB/s ⬆ 0.1kiB/s
2023-08-24 10:14:06.537  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.3kiB/s ⬆ 0.3kiB/s
2023-08-24 10:14:11.537  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.2kiB/s ⬆ 0.5kiB/s
2023-08-24 10:14:16.537  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (22 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 3.0kiB/s ⬆ 0.4kiB/s
2023-08-24 10:14:21.538  INFO tokio-runtime-worker substrate: [Parachain] ⚙️  Downloading state, 0%, 0.00 Mib (21 peers), best: #3096689 (0x930f…8dda), finalized #3096689 (0x930f…8dda), ⬇ 0.4kiB/s ⬆ 0.4kiB/s

and the same log keeps going for hrs.

rjonczy avatar Aug 24 '23 10:08 rjonczy

I will take a look :eyes:

skunert avatar Aug 24 '23 12:08 skunert

In the meantime, you could pass sync=trace to get more debug logs.

skunert avatar Aug 24 '23 12:08 skunert

@skunert adding this flag. Will let you know if its stucked again and will bring more logs.

rjonczy avatar Aug 24 '23 13:08 rjonczy