Cannot sync from genesis (or snapshot) with defunct bootnodes

Open barryz opened this issue 1 year ago • 3 comments

Hi dev team, we're from the DeBank and Rabby Wallet team. We're spinning up a Tomo full node on Mainnet and have followed every step in the official docs. We set the env variables and started the node from the binary, but the node doesn't connect to the bootnodes listed in the docs and prints the following debug logs:

[image: screenshot of the debug logs]

Could you help us connect with the active mainnet bootnodes, and proceed further with the setup and syncing process?

We look forward to your reply as soon as possible.

Thanks.

barryz avatar Jul 16 '24 05:07 barryz

@barryz have you had any luck with this in the meantime?

We're also trying to sync from the latest (July) archive snapshot and are seeing similar behavior, although not the exact same error messages. This keeps looping:

node-1  | TRACE[09-13|20:16:20] Starting bonding ping/pong               id=c8f2f0643527d4ef known=false failcount=9  age=479516h16m20.301921694s
node-1  | TRACE[09-13|20:16:20] Starting bonding ping/pong               id=fd3da177f9492a39 known=false failcount=9  age=479516h16m20.301966669s
node-1  | TRACE[09-13|20:16:20] Starting bonding ping/pong               id=97f0ca95a653e3c4 known=false failcount=10 age=479516h16m20.301971588s
node-1  | TRACE[09-13|20:16:20] Starting bonding ping/pong               id=b72927f349f3a27b known=false failcount=9  age=479516h16m20.302024748s
node-1  | TRACE[09-13|20:16:20] >> PING TOMO/v4                          addr=104.248.98.60:30301   err=nil
node-1  | TRACE[09-13|20:16:20] >> PING TOMO/v4                          addr=3.212.20.0:30301      err=nil
node-1  | TRACE[09-13|20:16:20] >> PING TOMO/v4                          addr=188.166.207.189:30301 err=nil
node-1  | TRACE[09-13|20:16:20] >> PING TOMO/v4                          addr=3.212.20.0:30303      err=nil
node-1  | TRACE[09-13|20:16:20] Dial task done                           task="discovery lookup"
node-1  | TRACE[09-13|20:16:20] Skipping dial candidate                  id=c8f2f0643527d4ef addr=104.248.98.60:30301   err="recently dialed"
node-1  | TRACE[09-13|20:16:20] New dial task                            task="discovery lookup"
node-1  | DEBUG[09-13|20:16:23] Recalculated downloader QoS values       rtt=5s confidence=1.000 ttl=5s

At verbosity 3, the node just logs:

node-1  | INFO [09-12|20:41:59] HTTP endpoint opened                     url=http://0.0.0.0:8545      cors=* vhosts=*
node-1  | INFO [09-12|20:41:59] WebSocket endpoint opened                url=ws://[::]:8546
node-1  | INFO [09-12|20:42:00] Unlocked account                         address=0x7E43864BEC15bAbc5D9c25918aD241dddD46A695
node-1  | INFO [09-12|20:42:00] Etherbase automatically configured       address=0x7E43864BEC15bAbc5D9c25918aD241dddD46A695
node-1  | INFO [09-12|21:41:57] Regenerated local transaction journal    transactions=0 accounts=0
node-1  | INFO [09-12|22:41:57] Regenerated local transaction journal    transactions=0 accounts=0
node-1  | INFO [09-12|23:41:57] Regenerated local transaction journal    transactions=0 accounts=0
...

We're using the bootnodes from the mainnet network docs:

enode://fd3da177f9492a39d1e7ce036b05745512894df251399cb3ec565081cb8c6dfa1092af8fac27991e66b6af47e9cb42e02420cc89f8549de0ce513ee25ebffc3a@3.212.20.0:30303
enode://97f0ca95a653e3c44d5df2674e19e9324ea4bf4d47a46b1d8560f3ed4ea328f725acec3fcfcb37eb11706cf07da669e9688b091f1543f89b2425700a68bc8876@3.212.20.0:30301
enode://b72927f349f3a27b789d0ca615ffe3526f361665b496c80e7cc19dace78bd94785fdadc270054ab727dbb172d9e3113694600dd31b2558dd77ad85a869032dea@188.166.207.189:30301
enode://c8f2f0643527d4efffb8cb10ef9b6da4310c5ac9f2e988a7f85363e81d42f1793f64a9aa127dbaff56b1e8011f90fe9ff57fa02a36f73220da5ff81d8b8df351@104.248.98.60:30301
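
In case it helps anyone matching trace lines to bootnodes: the `id=` values in the logs above are the first 16 hex characters of each enode's node ID. Below is a minimal sketch in Python that splits a comma-separated bootnode list into short ID, host, and port. The function names `parse_enode` and `parse_bootnodes` are made up for illustration, and reading the list from a `BOOTNODES` environment variable is an assumption, not something the node requires.

```python
from urllib.parse import urlsplit


def parse_enode(url):
    """Split an enode URL into (short_id, host, port).

    Hypothetical helper: the short ID is the first 16 hex characters of the
    node ID, which is the form the trace logs in this thread print as `id=`.
    """
    parts = urlsplit(url.strip())
    if parts.scheme != "enode" or not parts.username or parts.port is None:
        raise ValueError(f"not a valid enode URL: {url!r}")
    return parts.username[:16], parts.hostname, parts.port


def parse_bootnodes(value):
    """Parse a comma-separated list of enode URLs, skipping empty items."""
    return [parse_enode(u) for u in value.split(",") if u.strip()]


if __name__ == "__main__":
    import os

    # Assumption: the bootnode list is exported as BOOTNODES in the shell.
    for short_id, host, port in parse_bootnodes(os.environ.get("BOOTNODES", "")):
        print(f"{short_id}  {host}:{port}")
```

Running it with the list above exported as `BOOTNODES` prints one line per bootnode, which can then be compared against the `id=` and `addr=` fields in the trace output.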

ttibord avatar Sep 13 '24 20:09 ttibord

This issue has been addressed. If you still encounter it, please refer to this guide: https://docs.viction.xyz/how-to/how-to-troubleshoot-when-the-node-is-up-but-couldnt-begin-to-sync-block

hanker0x avatar Sep 18 '24 17:09 hanker0x

Thanks, @hanker0x, we synced the archive node successfully. Here's a minimal working docker compose file (update the image before 15th Oct), for anyone else who finds this:

services:
  node:
    image: tomochain/node:stable
    environment:
      IDENTITY: my_archive_node
      BOOTNODES: enode://fd3da177f9492a39d1e7ce036b05745512894df251399cb3ec565081cb8c6dfa1092af8fac27991e66b6af47e9cb42e02420cc89f8549de0ce513ee25ebffc3a@3.212.20.0:30303,enode://97f0ca95a653e3c44d5df2674e19e9324ea4bf4d47a46b1d8560f3ed4ea328f725acec3fcfcb37eb11706cf07da669e9688b091f1543f89b2425700a68bc8876@3.212.20.0:30301,enode://b72927f349f3a27b789d0ca615ffe3526f361665b496c80e7cc19dace78bd94785fdadc270054ab727dbb172d9e3113694600dd31b2558dd77ad85a869032dea@188.166.207.189:30301,enode://c8f2f0643527d4efffb8cb10ef9b6da4310c5ac9f2e988a7f85363e81d42f1793f64a9aa127dbaff56b1e8011f90fe9ff57fa02a36f73220da5ff81d8b8df351@104.248.98.60:30301
    volumes:
      - /viction/data:/tomochain/data
    ports:
      - "14545:8545"
      - "30303:30303"
      - "30303:30303/udp"
    command: [
      "--gcmode=archive",
      "--store-reward",
      "--rpcapi=db,eth,net,web3,debug,posv"
      ]

And here's the mounted DB directory structure, after downloading official snapshots:

/viction/data
├── tomo
│   ├── chaindata
│   ├── nodes   # node creates
│   └── rewards # node creates
└── tomox

ttibord avatar Oct 03 '24 10:10 ttibord

Hello @hanker0x, there seems to be an issue with the bootnodes again. I see these errors when starting up my node:

May 28 11:40:07 juju-b63373-0 tomo[1496742]: TRACE[05-28|11:40:07] Dial error                               task="dyndial fd3da177f9492a39 3.212.20.0:30303"      err="dial tcp 3.212.20.0:30303: i/o timeout"
May 28 11:40:07 juju-b63373-0 tomo[1496742]: TRACE[05-28|11:40:07] Dial task done                           task="dyndial fd3da177f9492a39 3.212.20.0:30303"
May 28 11:40:07 juju-b63373-0 tomo[1496742]: TRACE[05-28|11:40:07] Dial error                               task="dyndial 97f0ca95a653e3c4 104.248.98.78:30301"   err="dial tcp 104.248.98.78:30301: i/o timeout"
May 28 11:40:07 juju-b63373-0 tomo[1496742]: TRACE[05-28|11:40:07] Dial task done                           task="dyndial 97f0ca95a653e3c4 104.248.98.78:30301"
May 28 11:40:07 juju-b63373-0 tomo[1496742]: TRACE[05-28|11:40:07] Dial error                               task="dyndial b72927f349f3a27b 188.166.207.189:30301" err="dial tcp 188.166.207.189:30301: i/o timeout"
May 28 11:40:07 juju-b63373-0 tomo[1496742]: TRACE[05-28|11:40:07] Dial task done                           task="dyndial b72927f349f3a27b 188.166.207.189:30301"
May 28 11:40:07 juju-b63373-0 tomo[1496742]: TRACE[05-28|11:40:07] Dial error                               task="dyndial c8f2f0643527d4ef 104.248.98.60:30301"   err="dial tcp 104.248.98.60:30301: i/o timeout"
May 28 11:40:07 juju-b63373-0 tomo[1496742]: TRACE[05-28|11:40:07] Dial task done                           task="dyndial c8f2f0643527d4ef 104.248.98.60:30301"

This can also be confirmed with a web-based open-port checker or with netcat, e.g.:

netcat -zv 3.212.20.0 30303
netcat: connect to 3.212.20.0 port 30303 (tcp) failed: Connection timed out
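
For checking all four addresses in one go, here is a rough Python equivalent of the netcat test, offered as a sketch rather than an official tool. The helper name `tcp_reachable` and the 5-second timeout are my own choices; the addresses are the ones from the dial errors above. It only tests the TCP dial path, and since the discovery PINGs go over UDP, a failed TCP connect alone does not prove a discovery-only bootnode is down.

```python
import socket


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False


# Addresses taken from the "Dial error" lines in the log above.
BOOTNODE_ADDRS = [
    ("3.212.20.0", 30303),
    ("104.248.98.78", 30301),
    ("188.166.207.189", 30301),
    ("104.248.98.60", 30301),
]

if __name__ == "__main__":
    for host, port in BOOTNODE_ADDRS:
        state = "open" if tcp_reachable(host, port) else "unreachable"
        print(f"{host}:{port} {state}")
```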

jonathanudd avatar May 28 '25 11:05 jonathanudd