cardano-db-sync
cardano-db-sync copied to clipboard
Improve error message when db-sync shuts down becasue the node tip is behind db-sync
This is something that can be often overlooked but when restoring state for db-sync from snapshot and starting service - if cardano-node tip is behind db-sync this will result in error message and service shut down.
Error message could be improved and suggest the source of this problem.
Here is an example of how it looks now when cardano-node tip is behind db-sync for 13.1.0.0-rc2 when starting service after sucessful snapshot restoration:
export DbSyncAbortOnPanic=1; PGPASSFILE=config/pgpass-mainnet-restoration db-sync-node/bin/cardano-db-sync --config config/mainnet-config.yaml --socket-path ../cardano-node/mainnet/node.socket --schema-dir schema/ --state-dir ledger-state/mainnet_restoration/ | tee db_sync_upgrade_from_snapshot_logs2.txt
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Version number: 13.1.0.0
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Git hash: 3640e5aa00b8bada61e5d4b6eee063749866711b
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Option disable-ledger: False
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Option disable-cache: False
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Option disable-epoch: False
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Option skip-plutus-data-fix: False
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Option only-plutus-data-fix: False
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Option force-indexes: False
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Enviroment variable DbSyncAbortOnPanic: True
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Schema migration files validated
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Running database migrations in mode Initial
[db-sync-node:Info:6] [2022-11-24 10:37:23.28 UTC] Found maintenance_work_mem=2GB, max_parallel_maintenance_workers=2
[db-sync-node:Info:6] [2022-11-24 10:37:24.58 UTC] Some migrations were not executed. They need to run when syncing has started.
[db-sync-node:Info:6] [2022-11-24 10:37:24.58 UTC] New user indexes were not created. They may be created later if necessary.
[db-sync-node:Info:6] [2022-11-24 10:37:24.58 UTC] Using byron genesis file from: "config/../../cardano-node/mainnet/byron-genesis.json"
[db-sync-node:Info:6] [2022-11-24 10:37:24.58 UTC] Using shelley genesis file from: "config/../../cardano-node/mainnet/shelley-genesis.json"
[db-sync-node:Info:6] [2022-11-24 10:37:24.58 UTC] Using alonzo genesis file from: "config/../../cardano-node/mainnet/alonzo-genesis.json"
[db-sync-node:Info:6] [2022-11-24 10:37:24.97 UTC] NetworkMagic: 764824073
[db-sync-node:Info:6] [2022-11-24 10:37:25.30 UTC] Initial genesis distribution present and correct
[db-sync-node:Info:6] [2022-11-24 10:37:25.30 UTC] Total genesis supply of Ada: 31112484745.000000
[db-sync-node:Info:6] [2022-11-24 10:37:25.30 UTC] Inserting Shelley Genesis distribution
[db-sync-node:Info:6] [2022-11-24 10:37:25.30 UTC] epochStartup: Checking
[db-sync-node:Info:6] [2022-11-24 10:37:25.30 UTC] localInitiatorNetworkApplication: connecting to node via "../cardano-node/mainnet/node.socket"
[db-sync-node.Subscription:Notice:48] [2022-11-24 10:37:25.30 UTC] Identity Starting Subscription Worker, valency 1
[db-sync-node.Subscription:Notice:49] [2022-11-24 10:37:25.30 UTC] Identity Connection Attempt Start, destination LocalAddress "../cardano-node/mainnet/node.socket"
[db-sync-node.Subscription:Notice:49] [2022-11-24 10:37:25.30 UTC] Identity Connection Attempt End, destination LocalAddress "../cardano-node/mainnet/node.socket" outcome: ConnectSuccessLast
[db-sync-node.Handshake:Info:49] [2022-11-24 10:37:25.30 UTC] WithMuxBearer (ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"}) Send (ClientAgency TokPropose,MsgProposeVersions (fromList [(NodeToClientV_9,TInt 764824073),(NodeToClientV_10,TInt 764824073),(NodeToClientV_11,TInt 764824073),(NodeToClientV_12,TInt 764824073),(NodeToClientV_13,TInt 764824073)]))
[db-sync-node.Handshake:Info:49] [2022-11-24 10:37:25.32 UTC] WithMuxBearer (ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"}) Recv (ServerAgency TokConfirm,MsgAcceptVersion NodeToClientV_13 (TInt 764824073))
[db-sync-node.Mux:Info:49] [2022-11-24 10:37:25.32 UTC] Bearer on ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"} event: Handshake Client end, duration 0.023616872s
[db-sync-node.Mux:Info:50] [2022-11-24 10:37:25.32 UTC] Bearer on ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"} event: State: Mature
[db-sync-node.Mux:Info:50] [2022-11-24 10:37:25.32 UTC] Bearer on ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"} event: Eagerly started (MiniProtocolNum 5) in InitiatorDir
[db-sync-node.Mux:Info:50] [2022-11-24 10:37:25.32 UTC] Bearer on ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"} event: Eagerly started (MiniProtocolNum 6) in InitiatorDir
[db-sync-node.Mux:Info:50] [2022-11-24 10:37:25.32 UTC] Bearer on ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"} event: Eagerly started (MiniProtocolNum 7) in InitiatorDir
[db-sync-node.Mux:Info:50] [2022-11-24 10:37:25.32 UTC] Bearer on ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"} event: Eagerly started (MiniProtocolNum 9) in InitiatorDir
[db-sync-node.Subscription:Notice:48] [2022-11-24 10:37:25.32 UTC] Identity Required subscriptions started
[db-sync-node:Info:53] [2022-11-24 10:37:25.32 UTC] Setting ConsistencyLevel to Unchecked
[db-sync-node:Info:53] [2022-11-24 10:37:25.32 UTC] Starting the fixing Plutus Data bytes procedure. This may take a couple hours on mainnet if there are wrong values. You can skip it using --skip-plutus-data-fix. It will fix Datum and RedeemerData with wrong bytes. See more in Issue #1214 and #1278. This procedure makes resyncing unnecessary.
[db-sync-node:Info:53] [2022-11-24 10:37:25.32 UTC] Trying to find Datum with wrong bytes
[db-sync-node:Info:53] [2022-11-24 10:37:25.94 UTC] There are 6362320 Datum. Need to scan them all.
[db-sync-node:Info:53] [2022-11-24 10:37:48.35 UTC] Found some wrong values already. The oldest ones are (hash, bytes): [("2512243dcf7a3c3af70e50cdeea19d952e093d2395d95af67d21ed6b92224562","d87a9fd8799f581cd52e11f3e48436dd42dbec6d88c239732e503b8b7a32af58e5f87625413001ffff"),("f7f2f57c58b5e4872201ab678928b0d63935e82d022d385e1bad5bfe347e89d8","d87980"),("50707eebc0b23a108309c74e70874daf3c01bc180ecd9d5ee93293b53d456cfa","d87b9fd8799f581cd52e11f3e48436dd42dbec6d88c239732e503b8b7a32af58e5f87625433339391a17d78400ffff"),("535349dcf2548991b02ea37b4375fc7472ad2f554d061a0e377fcad472e28616","d87a9fd8799f581cd52e11f3e48436dd42dbec6d88c239732e503b8b7a32af58e5f87625443230303001ffff"),("74bf0fc59a9953a426a1a5f25d952ebc83c03e8fc3c10ce2d49961788388c074","d87a9fd8799f581cd52e11f3e48436dd42dbec6d88c239732e503b8b7a32af58e5f87625443234373901ffff")]
[db-sync-node:Info:53] [2022-11-24 10:41:01.15 UTC] Checked 1000000 Datum
[db-sync-node:Info:53] [2022-11-24 10:44:54.33 UTC] Checked 2000000 Datum
[db-sync-node:Info:53] [2022-11-24 10:46:40.00 UTC] Checked 3000000 Datum
[db-sync-node:Info:53] [2022-11-24 10:47:21.19 UTC] Checked 4000000 Datum
[db-sync-node:Info:53] [2022-11-24 10:48:08.27 UTC] Checked 5000000 Datum
[db-sync-node:Info:53] [2022-11-24 10:48:58.38 UTC] Checked 6000000 Datum
[db-sync-node:Info:53] [2022-11-24 10:49:18.30 UTC] Found 1394785 Datum with mismatch between bytes and hash.
[db-sync-node:Info:53] [2022-11-24 10:49:18.30 UTC] Trying to find RedeemerData with wrong bytes
[db-sync-node:Info:53] [2022-11-24 10:49:18.31 UTC] There are 52241 RedeemerData. Need to scan them all.
[db-sync-node:Info:53] [2022-11-24 10:49:18.98 UTC] Found some wrong values already. The oldest ones are (hash, bytes): [("b3d5894a64d4b5718c2a1b0c32ba302319b983647f9c2928c115d94c6e63ab72","d87b80"),("883f093b3639a82c7a6aefbc79aedcb7d8a2d561f772d5ab2fe9431a34c5f7a3","d87c80"),("f7f2f57c58b5e4872201ab678928b0d63935e82d022d385e1bad5bfe347e89d8","d87980"),("bade4f39af999eb49aca23e1c12f5c5bf1a16ebf63a85f7b9e17b4ff3d82886a","d87a80"),("ccb127aef877cfc0abeb3633875fec21dfd357a2319808a634e9c7cd1164b4a2","00")]
[db-sync-node:Info:53] [2022-11-24 10:49:18.98 UTC] Found 228 RedeemerData with mismatch between bytes and hash.
[db-sync-node:Info:53] [2022-11-24 10:49:18.98 UTC] Starting chainsync to fix Plutus Data. This will update database values in tables datum and redeemer_data.
[db-sync-node:Info:53] [2022-11-24 10:49:18.98 UTC] Starting fixing Plutus Data At (Block {blockPointSlot = SlotNo 49184126, blockPointHash = 92a76285fb2666d40a0c05f27d4cae91a6345e10188d4dca9937c4b35fb19347})
[db-sync-node:Error:53] [2022-11-24 10:49:19.04 UTC] ChainSyncWithBlocksPtcl: recvMsgIntersectNotFound
CallStack (from HasCallStack):
error, called at src/Cardano/DbSync/Sync.hs:452:67 in cardano-db-sync-13.1.0.0-2UdzDeVCguxIYNhseXdjDD:Cardano.DbSync.Sync
[db-sync-node.Mux:Info:50] [2022-11-24 10:49:19.04 UTC] Bearer on ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"} event: State: Dead
[db-sync-node.Mux:Notice:50] [2022-11-24 10:49:19.04 UTC] Bearer on ConnectionId {localAddress = LocalAddress "", remoteAddress = LocalAddress "mainnet/node.socket"} event: Miniprotocol MiniProtocolNum 5 InitiatorDir terminated with exception recvMsgIntersectNotFound
CallStack (from HasCallStack):
error, called at src/Cardano/DbSync/Sync.hs:452:67 in cardano-db-sync-13.1.0.0-2UdzDeVCguxIYNhseXdjDD:Cardano.DbSync.Sync
[db-sync-node.Subscription:Error:49] [2022-11-24 10:49:19.04 UTC] Identity Application Exception: LocalAddress "../cardano-node/mainnet/node.socket" recvMsgIntersectNotFound
CallStack (from HasCallStack):
error, called at src/Cardano/DbSync/Sync.hs:452:67 in cardano-db-sync-13.1.0.0-2UdzDeVCguxIYNhseXdjDD:Cardano.DbSync.Sync
cardano-db-sync: recvMsgIntersectNotFound
CallStack (from HasCallStack):
error, called at src/Cardano/DbSync/Sync.hs:452:67 in cardano-db-sync-13.1.0.0-2UdzDeVCguxIYNhseXdjDD:Cardano.DbSync.Sync[db-sync-node.ErrorPolicy:Error:6] [2022-11-24 10:49:19.04 UTC] IP LocalAddress "../cardano-node/mainnet/node.socket" ErrorPolicyLocalNodeError (ApplicationExceptionTrace recvMsgIntersectNotFound
CallStack (from HasCallStack):
error, called at src/Cardano/DbSync/Sync.hs:452:67 in cardano-db-sync-13.1.0.0-2UdzDeVCguxIYNhseXdjDD:Cardano.DbSync.Sync)
This error message could indicate that the node is behind and block for some seconds, so that the fail loop doesn't happen that often.
This is fixed on 13.1.0.0. Shoul we close @ArturWieczorek?