Mina Daemon keeps crashing in bootstrap

Open mina-payout opened this issue 4 years ago • 8 comments

Preliminary Checks

  • [X] This issue is not a duplicate. Before opening a new issue, please search existing issues: https://github.com/MinaProtocol/mina/issues
  • [X] This issue is not a question, feature request, RFC, or anything other than a bug report. Please post those things in GitHub Discussions: https://github.com/MinaProtocol/mina/discussions

Description

I have a Mina daemon node set up on an AWS instance, and it keeps failing to start. I am opening this issue as suggested in the Mina logs.

Version: gcr.io/o1labs-192920/mina-daemon-baked:1.1.8-b10c0e3-mainnet

Steps to Reproduce

  1. Set up a Mina daemon with the SNARK worker enabled on an AWS EC2 instance

  2. Mina daemon version: gcr.io/o1labs-192920/mina-daemon-baked:1.1.8-b10c0e3-mainnet

  3. Docker command: docker run --name mina -d -p 8302:8302 -p 3095:3095/tcp --restart=always --mount "type=bind,source=pwd/keys,dst=/home/ubuntu/keys,readonly" --mount "type=bind,source=pwd/.mina-config,dst=/home/ubuntu/" -e CODA_PRIVKEY_PASS="CODA_PRIVKEY_PASS" gcr.io/o1labs-192920/mina-daemon-baked:1.1.8-b10c0e3-mainnet daemon --metrics-port 6060 --block-producer-key /home/ubuntu/keys/my-wallet --insecure-rest-server --file-log-level Debug --log-level Info --peer-list-url https://storage.googleapis.com/mina-seed-lists/mainnet_seeds.txt --run-snark-worker public_key --snark-worker-fee 0.000979 --snark-worker-parallelism 4 --open-limited-graphql-port --limited-graphql-port 3095

  4. Crash report attached: coda_crash_report_2021-11-25_10-42-56.127850.tar.gz

Expected Result

Mina daemon should start normally

Actual Result

The daemon fails to start and sync.

How frequently do you see this issue?

Always

What is the impact of this issue on your ability to run a node?

Blocker

Status

Mina daemon status
-----------------------------------

Max observed block height:              84229
Max observed unvalidated block height:  84229
Local uptime:                           11m14s
Chain id:                               5f704cc0c82e0ed70e873f0893d7e06f148524e3f0bdae2afb02e7819a0c24d1
Git SHA-1:                              b10c0e3db9112a2a8aebc3eec7c6d2570fcc4044
Configuration directory:                /root/.mina-config
Peers:                                  0
User_commands sent:                     0
SNARK worker:                           --
SNARK work fee:                         979000
Sync status:                            Bootstrap
Block producers running:                1 (--)
Consensus time now:                     epoch=17, slot=281
Consensus mechanism:                    proof_of_stake
Consensus configuration:
        Delta:                     0
        k:                         290
        Slots per epoch:           7140
        Slot duration:             3m
        Epoch duration:            14d21h
        Chain start timestamp:     2021-03-17 00:00:00.000000Z
        Acceptable network delay:  3m

Addresses and ports:
        External IP:    --
        Bind IP:        0.0.0.0
        Libp2p PeerID:  12D3KooWGBZ4GMiCD2BwZWDbk4DAxc28QhU25wFXQTduRT4NLNyW
        Libp2p port:    8302
        Client port:    8301
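
As a sanity check on the consensus values in this status output, the sketch below converts "Consensus time now" into a UTC timestamp. It assumes that slots are counted from the chain start timestamp and that each epoch holds exactly "Slots per epoch" slots; the function name `slot_start_time` is made up for illustration and is not part of the Mina CLI.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the status output above.
CHAIN_START = datetime(2021, 3, 17, tzinfo=timezone.utc)
SLOT_DURATION = timedelta(minutes=3)
SLOTS_PER_EPOCH = 7140


def slot_start_time(epoch: int, slot: int) -> datetime:
    """Return the UTC time at which the given epoch/slot begins.

    Assumption: global slot 0 starts exactly at CHAIN_START.
    """
    global_slot = epoch * SLOTS_PER_EPOCH + slot
    return CHAIN_START + global_slot * SLOT_DURATION


# 7140 slots of 3 minutes each give the reported epoch duration of 14d21h.
print(SLOTS_PER_EPOCH * SLOT_DURATION)  # 14 days, 21:00:00

# "Consensus time now: epoch=17, slot=281" from the report.
print(slot_start_time(17, 281))  # 2021-11-25 11:03:00+00:00
```

Under these assumptions the result falls a few minutes before the time the issue was posted, which suggests the daemon's clock and consensus parameters were fine and the problem lies elsewhere (for example, the status shows 0 peers).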

Additional information

coda_crash_report_2021-11-25_10-42-56.127850.tar.gz

mina-payout avatar Nov 25 '21 11:11 mina-payout

This issue was observed on a very old release. I will close this for now. Feel free to reopen if you observe this on the latest releases.

p-shahi avatar May 09 '22 16:05 p-shahi

I had this problem with Mainnet Stable Release 1.3.1 as well.

leezijie avatar Jul 07 '22 09:07 leezijie

Mina daemon status

Max observed block height:              155411
Max observed unvalidated block height:  155411
Local uptime:                           3h14m13s
Chain id:                               5f704cc0c82e0ed70e873f0893d7e06f148524e3f0bdae2afb02e7819a0c24d1
Git SHA-1:                              3e3abecd4fd197017321d61a65a25f0bbdc40f3a
Configuration directory:                /data/.mina-config
Peers:                                  27
User_commands sent:                     2
SNARK worker:                           None
SNARK work fee:                         100000000
Sync status:                            Bootstrap
Block producers running:                0
Coinbase receiver:                      Block producer
Consensus time now:                     epoch=32, slot=680
Consensus mechanism:                    proof_of_stake
Consensus configuration:
        Delta:                     0
        k:                         290
        Slots per epoch:           7140
        Slot duration:             3m
        Epoch duration:            14d21h
        Chain start timestamp:     2021-03-17 00:00:00.000000Z
        Acceptable network delay:  3m

Addresses and ports:
        External IP:    52.194.8.37
        Bind IP:        0.0.0.0
        Libp2p PeerID:  12D3KooWQiEaxZbQJ6PyUsVnxQNeHgAS728PSvQi7ictPxgHsnXV
        Libp2p port:    10101
        Client port:    8301

leezijie avatar Jul 07 '22 10:07 leezijie

Mina daemon status

Max observed block height:              155411
Max observed unvalidated block height:  155411
Local uptime:                           3h14m13s
Chain id:                               5f704cc0c82e0ed70e873f0893d7e06f148524e3f0bdae2afb02e7819a0c24d1
Git SHA-1:                              3e3abec
Configuration directory:                /data/.mina-config
Peers:                                  27
User_commands sent:                     2
SNARK worker:                           None
SNARK work fee:                         100000000
Sync status:                            Bootstrap
Block producers running:                0
Coinbase receiver:                      Block producer
Consensus time now:                     epoch=32, slot=680
Consensus mechanism:                    proof_of_stake
Consensus configuration:
        Delta:                     0
        k:                         290
        Slots per epoch:           7140
        Slot duration:             3m
        Epoch duration:            14d21h
        Chain start timestamp:     2021-03-17 00:00:00.000000Z
        Acceptable network delay:  3m

Addresses and ports:
        External IP:    52.194.8.37
        Bind IP:        0.0.0.0
        Libp2p PeerID:  12D3KooWQiEaxZbQJ6PyUsVnxQNeHgAS728PSvQi7ictPxgHsnXV
        Libp2p port:    10101
        Client port:    8301

As you can see, after more than 3 hours of uptime the node is still in Bootstrap.
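
For anyone who wants to watch for this condition automatically, here is a minimal sketch that reads the "Local uptime" and "Sync status" fields out of status text like the output above and flags a node that has stayed in Bootstrap too long. The helper names and the one-hour threshold are hypothetical choices for illustration, not anything defined by Mina.

```python
import re

UPTIME_PART = re.compile(r"(\d+)([dhms])")
SECONDS_PER_UNIT = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_uptime(text: str) -> int:
    """Convert a duration such as '3h14m13s' into seconds."""
    return sum(int(n) * SECONDS_PER_UNIT[u] for n, u in UPTIME_PART.findall(text))


def status_field(status: str, name: str) -> str:
    """Return the first whitespace-free token printed after 'name:'."""
    match = re.search(rf"{re.escape(name)}:\s*(\S+)", status)
    if match is None:
        raise KeyError(name)
    return match.group(1)


def stuck_in_bootstrap(status: str, limit_seconds: int = 3600) -> bool:
    """True if the node reports Bootstrap after more than limit_seconds of uptime.

    The default limit of one hour is an arbitrary assumption.
    """
    uptime = parse_uptime(status_field(status, "Local uptime"))
    return status_field(status, "Sync status") == "Bootstrap" and uptime > limit_seconds


sample = """\
Local uptime:                           3h14m13s
Sync status:                            Bootstrap
"""
print(parse_uptime("3h14m13s"))    # 11653
print(stuck_in_bootstrap(sample))  # True
```

With the first report in this thread (uptime 11m14s) the same check returns False, since that node had not yet been running for an hour.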

leezijie avatar Jul 07 '22 10:07 leezijie

@p-shahi

leezijie avatar Jul 07 '22 10:07 leezijie

What size of AWS instance was it? This is likely caused by the verifier process crashing or running out of resources, which prevents you from verifying the block that you bootstrap to.

lk86 avatar Jul 12 '22 19:07 lk86

> What size of AWS instance was it? This is likely caused by the verifier process crashing or running out of resources, which prevents you from verifying the block that you bootstrap to.

Ubuntu 20, Docker, 8 cores, 16 GB RAM, 1 TB disk.

leezijie avatar Jul 13 '22 02:07 leezijie

Hey @mina-payout and @leezijie, would it be possible on your end to try to reproduce the issue using AWS instance types such as m4.2xlarge or m5.2xlarge?

shimkiv avatar Aug 07 '22 16:08 shimkiv

Closing this issue because it has been inactive for a long time. Please feel free to open another ticket should you experience any further issues.

shimkiv avatar Sep 19 '22 15:09 shimkiv