stacks-core

testnet node restart causes sync from scratch

Open pseudozach opened this issue 1 year ago • 6 comments

I have a testnet node that had stalled at height 153329, so I restarted it. Now it fails with the error below and starts to sync from scratch.

Nothing changed on the file system, so it can't be a permission or DB issue.

cc @wileyj

...
stacks-blockchain      | INFO [1712543044.720830] [testnet/stacks-node/src/run_loop/boot_nakamoto.rs:205] [epoch-2/3-boot] Failed to open Sortition DB while checking current burn height, assuming height = 0
stacks-blockchain      | INFO [1712543044.983827] [stackslib/src/burnchains/bitcoin/spv.rs:1286] [main] Syncing Bitcoin headers: 57.6% (1490000 out of 2585590)
...

I definitely have data, so I'm not sure why it's somehow inaccessible. Here are the folder contents, as requested by jw:

zach@lnswap-1:~/stacks-blockchain-docker/persistent-data/testnet/stacks-blockchain/xenon$ ls -l
total 814032
-rw-r--r-- 1 root zach   2203648 Mar 30 04:55 atlas.sqlite
drwxr-sr-x 3 root zach      4096 Apr  8 01:48 burnchain
drwxr-sr-x 5 root zach      4096 Apr  8 01:48 chainstate
-rw-r--r-- 1 root zach 829743104 Apr  4 05:03 headers.sqlite
-rw-r--r-- 1 root zach   1523712 Apr  8 01:48 headers.sqlite.reorg
-rw-r--r-- 1 root zach     57344 Apr  8 01:48 peer.sqlite
-rw-r--r-- 1 root zach     28672 Mar 25 20:05 stacker_db.sqlite
zach@lnswap-1:~/stacks-blockchain-docker/persistent-data/testnet/stacks-blockchain/xenon$ du -sh chainstate/
21G     chainstate/

Update: the behavior is the same after restoring chainstate from the Hiro archives, and it is also the same with both the 2.5.0.0.0-rc1 and next images.

# STACKS_BLOCKCHAIN_VERSION=2.5.0.0.0-rc1
STACKS_BLOCKCHAIN_VERSION=next
STACKS_BLOCKCHAIN_API_VERSION=7.10.0-beta.1

Apr 08 '24 02:04 pseudozach

ping @CharlieC3 @obycode @kantai

interesting issue that i haven't seen before

Apr 08 '24 02:04 wileyj

this was from a node that was at chain tip, and this happened on a restart. @pseudozach can you copy/paste your current chainstate dir? (ls -alh is probably fine)

Apr 08 '24 02:04 wileyj

https://github.com/stacks-network/stacks-core/blob/next/testnet/stacks-node/src/run_loop/boot_nakamoto.rs#L205

https://github.com/stacks-network/stacks-core/blob/9b377f9c36357d6bdc7df4134c0bfd358c42c651/stackslib/src/net/chat.rs#L2751

Apr 08 '24 02:04 wileyj

this was from a node that was at chain tip, and this happened on a restart. @pseudozach can you copy/paste your current chainstate dir? (ls -alh is probably fine)

sure

zach@lnswap-1:~/stacks-blockchain-docker/persistent-data/testnet/stacks-blockchain/xenon/chainstate$ ls -alh
total 9.7M
drwxr-sr-x     5 root zach  4.0K Apr  8 01:48 .
drwxr-sr-x     4 root zach  4.0K Apr  8 01:48 ..
drwxr-sr-x 59329 root zach 1020K Apr  8 01:48 blocks
drwxr-sr-x     2 root zach  4.0K Apr  8 01:48 estimates
-rw-r--r--     1 root zach  8.6M Apr  8 01:48 mempool.sqlite
-rw-r--r--     1 root zach   12K Mar 25 17:01 tx_tracking.sqlite
drwxr-sr-x     3 root zach  4.0K Apr  8 01:48 vm

Apr 08 '24 02:04 pseudozach

Can you run ls -lah on the xenon/burnchain directory as well?

Apr 08 '24 16:04 kantai

this may have been a config issue with the working_dir key, but let's leave it open in case it comes up again. i've asked @pseudozach to collect logs in case it stalls again. currently the node is working as expected.

Apr 08 '24 23:04 wileyj