Bor sync stuck at block 0x312d050
System information
Bor client version: 1.2.1
Heimdall client version: 1.0.3
OS & Version: Linux
Environment: Polygon Mainnet
Type of node: Full
Overview of the problem
I have been running a full node using bor and heimdall via Docker for the last 2 months, but bor sync got stuck 11 hours ago at block 0x312d050. I am getting the following logs from the bor Docker container:
bor | WARN [12-26|16:31:24.814] unable to handle whitelist milestone err="missing blocks"
bor | INFO [12-26|16:31:36.814] Got new milestone from heimdall start=51,584,847 end=51,584,869 hash=0x112ae9614d96a0db2fb572d324f1ca505983ef0b309b1c0970f698994964bb89
bor | WARN [12-26|16:31:36.815] unable to handle whitelist milestone err="missing blocks"
bor | INFO [12-26|16:31:40.819] Got new checkpoint from heimdall start=51,583,142 end=51,583,653 rootHash=0xbaa9de2414f3853a1be0556bd33ca614024e6a8b864940a482e2c84fa1527bf1
bor | WARN [12-26|16:31:40.819] Failed to whitelist checkpoint err="missing blocks"
bor | WARN [12-26|16:31:40.819] unable to handle whitelist checkpoint err="missing blocks"
bor | INFO [12-26|16:31:48.813] Got new milestone from heimdall start=51,584,847 end=51,584,869 hash=0x112ae9614d96a0db2fb572d324f1ca505983ef0b309b1c0970f698994964bb89
bor | WARN [12-26|16:31:48.813] unable to handle whitelist milestone err="missing blocks"
bor | INFO [12-26|16:32:00.814] Got new milestone from heimdall start=51,584,847 end=51,584,869 hash=0x112ae9614d96a0db2fb572d324f1ca505983ef0b309b1c0970f698994964bb89
bor | WARN [12-26|16:32:00.815] unable to handle whitelist milestone err="missing blocks"
bor | INFO [12-26|16:32:12.814] Got new milestone from heimdall start=51,584,847 end=51,584,869 hash=0x112ae9614d96a0db2fb572d324f1ca505983ef0b309b1c0970f698994964bb89
bor | WARN [12-26|16:32:12.814] unable to handle whitelist milestone err="missing blocks"
bor | INFO [12-26|16:32:24.814] Got new milestone from heimdall start=51,584,870 end=51,584,892 hash=0xdef7276b17971f87470ffa0c516ec2a1de75fd12564106af2771f084d7bc63e8
bor | WARN [12-26|16:32:24.814] unable to handle whitelist milestone err="missing blocks"
bor | INFO [12-26|16:32:36.815] Got new milestone from heimdall start=51,584,870 end=51,584,892 hash=0xdef7276b17971f87470ffa0c516ec2a1de75fd12564106af2771f084d7bc63e8
bor | WARN [12-26|16:32:36.815] unable to handle whitelist milestone err="missing blocks"
bor | WARN [12-26|16:32:44.111] Snapshot extension registration failed peer=5f67ba47 err="peer connected on snap without compatible eth support"
bor | INFO [12-26|16:32:48.815] Got new milestone from heimdall start=51,584,870 end=51,584,892 hash=0xdef7276b17971f87470ffa0c516ec2a1de75fd12564106af2771f084d7bc63e8
bor | WARN [12-26|16:32:48.815] unable to handle whitelist milestone err="missing blocks"
bor | INFO [12-26|16:33:00.814] Got new milestone from heimdall start=51,584,893 end=51,584,911 hash=0x72137465e871305c04cae0d017d60848a17b4f70caa302f7d3b8e55615a8ac54
bor | WARN [12-26|16:33:00.814] unable to handle whitelist milestone err="missing blocks"
Any idea how I can fix it? I tried restarting the Docker container, but the error remains.
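To see how far behind the node is, the milestone lines quoted above can be compared with the stuck head. The following is a minimal sketch, assuming the stuck head is the block from the title (0x312d050) and that milestone lines always look like the samples in this report. The regex and the function name `milestone_gaps` are illustrative and not part of bor.

```python
import re

# Pattern inferred from the log lines quoted in this issue (an assumption,
# not an official log format specification).
MILESTONE_RE = re.compile(
    r"Got new milestone from heimdall\s+start=([\d,]+)\s+end=([\d,]+)"
)


def milestone_gaps(log_lines, local_head):
    """Return (milestone_end_block, blocks_behind) for each milestone line."""
    gaps = []
    for line in log_lines:
        match = MILESTONE_RE.search(line)
        if match:
            end_block = int(match.group(2).replace(",", ""))
            gaps.append((end_block, end_block - local_head))
    return gaps


# The stuck block from the title, converted from hex to decimal.
local_head = int("0x312d050", 16)

# Three lines copied from the logs above.
sample = [
    "bor | INFO [12-26|16:31:36.814] Got new milestone from heimdall start=51,584,847 end=51,584,869 hash=0x112ae9614d96a0db2fb572d324f1ca505983ef0b309b1c0970f698994964bb89",
    'bor | WARN [12-26|16:31:36.815] unable to handle whitelist milestone err="missing blocks"',
    "bor | INFO [12-26|16:33:00.814] Got new milestone from heimdall start=51,584,893 end=51,584,911 hash=0x72137465e871305c04cae0d017d60848a17b4f70caa302f7d3b8e55615a8ac54",
]

for end_block, behind in milestone_gaps(sample, local_head):
    print(f"milestone end={end_block} local head is {behind} blocks behind")
```

Under these assumptions the head is roughly 20,000 blocks behind the milestones, which would be consistent with the "missing blocks" warnings: the node cannot whitelist a range it has not imported yet.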
same
I also tried debug.setHead to move the head back some blocks (1000+ behind). Sync resumed for a few hours, but then stopped again.
Any idea how to fix it?
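For anyone repeating the debug.setHead experiment: as far as I know, the geth-style console expects the target block as a hex string. Below is a small sketch for computing that argument. The helper name `set_head_argument` is made up for illustration, and the 1000-block rewind simply mirrors the comment above.

```python
def set_head_argument(current_head_hex, blocks_back):
    """Return the hex block number to pass to debug.setHead."""
    target = int(current_head_hex, 16) - blocks_back
    if target < 0:
        raise ValueError("cannot rewind past the genesis block")
    return hex(target)


# Rewind 1000 blocks from the stuck head reported in this issue.
print(set_head_argument("0x312d050", 1000))
```

The printed value can then be pasted into the console call, e.g. debug.setHead("0x...") with the value filled in.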
Same issue.
Same version for bor and heimdall. Although stuck at a different height (51640301), I constantly get this WARN log:
Dec 28 14:30:33 203078 bor[1828860]: WARN [12-28|14:30:33.977] unable to handle whitelist milestone err="missing blocks"
Can you check your peers using ipc and admin.peers command ?
@0xKrishna
enode://11e0cbb03a834019b0222f54bccf32512bef4294dd722642684762d1d01c84031c1075767195d9968dcdb9e38326f08b14547d8e33b0b67a0ef1aa0b045845d0@35.171.120.130:30303?discport=30315,
enode://b0f026f7ccfd5c1450e933572ae44b262a7d084647a30d0a8d9e2c8cab8d5b1c7721f3c60bfcd50c0fede114c7e2d316649389ba2449ca85d1ddd9e2947f1c28@147.135.100.106:30303?discport=30334,
enode://2d4bd1fa38182fa868a583fc946c8d5e4043b013381cf20927c16cf8f17b4f3e793c5e9f34fc785c52d887aab07181bdb0ebae50d9e3f05e5c14aed19f81929a@65.108.127.87:30303?discport=30340,
enode://ab879b4eaacf495ec760f2806e78509da80e327ba4262d8153698f88b0a95287a692bbaf3a3cece9ad27f889246c04e2b5ca8e75bf083acbb4806eb669cc3a77@35.171.120.130:30303?discport=30334,
enode://1a69f7dae12959a358b92a395ec79de2ab4601a59a5b0b951d4e6247da2101d7d6d77a919086251e70b552a49ae74d630e19233306a189a1b627c2115ecf3cfa@34.203.27.246:30303?discport=30320,
enode://574a9195f40a7c4bd68536167ef53a7385bab8934dfc8db94d013b1a73af76eb73f148536cb8b8365e8240728f6e80af0ddb4ead3a2544de907cce561839ce61@51.81.217.117:30303?discport=30323,
enode://142cce22e125325f4895b2268e32185f5dbe90f9c818ab135f16c7face23a55b46d0b78a0286595a262d4fa58ff314e7e2553e13f528a3c3e9616184b77f5b85@65.108.127.87:30303?discport=30323,
enode://50c8f9d2849a209383edd15dfd67ba0a8d3f5e9853fd1af9c1678f4aef2dc5e3817c34ddce9390d5e8dd4891ad7f66003a3bea5af9e288df6f26ed070d9bd741@54.38.217.112:30303?discport=30335,
enode://72be2da5ba01bc2f3a7764bf1d4f18550a36df629820ea0f6d37fe1cd1355d0f1c201b2a5f382e794ee56e0f5befa504e85e96548a45a0fba44bb6bd1075e28e@54.38.155.225:30303?discport=30306,
enode://53b53f55f2a1674873f8f58ee23616db8384f278a1206cf79c8c18d4ebc32b4424128229de2ea999803c08c9262974f1fb1f2b0d87ca6ec40aea1594c0ba0ef7@65.108.1.189:30303?discport=30337,
enode://eb0ee5596ea6df526eb7e0ace41f015bcb9ee4f27996c72ea15d1cd28ec69f89b6e64247696c0150111b52ca58810f5d0f42d59ac38fdb26ba7323bcc835475a@51.81.196.100:30303?discport=30313,
enode://c4a2a7c422ddce70a39164ce53762262bd5dc8917f5613b1c92c94affb36516e63f88721763a1dcfed5f36403e0fc21894e34c2981f2f6f1f100b9f186a986a1@51.38.72.15:30303?discport=30307,
enode://2197472b27c39587e2ae2c199e91527a25d25b2c1217f14c8d8b342068209a889913c7c1eb6f60044a0d28bd59ccec157d18ebb7918293e8878d11185831cf22@54.38.75.21:30303?discport=30320,
enode://b6d9bef47ce86b94331cdcfd2a1a91f28ab48db171aa70659973b3869988e7e4806fd24406c6f57187664643dffc0edf74e7a16ac315ca7933589357ec875550@51.38.72.15:30303?discport=30311,
enode://4585b746a2ae2f74575313199bd35159e8b679608fa1bd4e3a2823c0c24f8e49f9cb1e0c312de30a8b08c16a6666101897ffff47a6c162dca6ddb87c206c4cd2@66.70.233.151:30303?discport=30313,
enode://c8ab3d6ec8d7c1c7df462f55f02acaced2949ec4542475fa25ebb104feaa78a196f0e39cfc2bf1236ead1c647b734726cb9f4f03eb933c94f318cca160e5ce16@54.38.217.112:30303?discport=30334
Can you check your peers using ipc and admin.peers command ?
Sure. I rolled back from bor v1.2.1 to v1.1.0 because some issues said that rolling back might be a solution. I then noticed an interesting behavior: every time I restart the bor service, it syncs for a while and then gets stuck with the above "whitelist milestone" log.
> admin.peers [{ caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://6de3bbba54699dcc11b982c7970fdd938946d3638bab27d9006698b998447cf838891a310c61d6c74d042091366ba07690ef1f09a026fab28a31a06cd387b67b@13.57.125.97:30321", enr: "enr:-KO4QIY12LW3IWDW2JzqMdtg9Pyv7PEASdnlLFAEzUEuzOgVEvW5hWe2EB_Jd6iqKnRHi_SyP1INx6iDk3a6CMoyqOqGAYvR7YGvg2V0aMfGhNwIhlyAgmlkgnY0gmlwhA05fWGJc2VjcDI1NmsxoQNt47u6VGmdzBG5gseXD92TiUbTY4urJ9kAZpi5mER8-IRzbmFwwIN0Y3CCdnGDdWRwgnZx", id: "136d74cf29e85b49f991b1d97b5800f1a45968b0542642c47c970c1502762313", name: "bor/v1.1.0/linux-amd64/go1.20.10", network: { inbound: false, localAddress: "172.18.35.78:37836", remoteAddress: "13.57.125.97:30321", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://b8187a46754cdf631d67b89e3e73d5e061ab2ce5a62cc8a79cfd754b04dc5394b381f1d99d59a8b6baeb68b4c019512b59dcbdc0cb682320f96508331cf8e8f3@54.38.217.112:30303?discport=30324", id: "1c405a70749de50ea441c6c59c07e7d4dde5e18f47102a20b88db98cddcbb6a2", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:51320", remoteAddress: "54.38.217.112:30303", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://256fe3efb2f83e4821f4d028273757e525da48bb69a3da5c4230a410d5b96e948a79ae42e60a4914092249ee3bb928756534c67b6c3003f0d08a180373735edc@65.108.1.189:30303?discport=30395", enr: "enr:-KO4QHQlnI0aegmfJbdsiPIskZywzNjBmulaKf9scy3wuCR_XirUnjEjwSsDfjJe40LWodLNpjLDW48N4MtdFEXOXh6GAYx2yUm_g2V0aMfGhNwIhlyAgmlkgnY0gmlwhEFsAb2Jc2VjcDI1NmsxoQIlb-Pvsvg-SCH00CgnN1flJdpIu2mj2lxCMKQQ1blulIRzbmFwwIN0Y3CCdl-DdWRwgna7", id: "3e8f038a2af1414377f24cacf7e6591b4007c60b8de292b7bec24d7a27cd9c49", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:52390", remoteAddress: "65.108.1.189:30303", static: 
false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://2cd2be98b78f486171994f32ca995f4d53a783172f360a9224181c3cb1b487bd88e95658cb05405642ee2455fc31ae0919f8b2699cc02ed9ed2aef09b9fc93c2@54.38.216.84:30303?discport=30331", enr: "enr:-KO4QN1KbAC8kuy161pxm8kHqtI8VMjk9cQjVFJT4s6TH3G-LJK4QAdY7LqugQ8Yt8-hYUzFDrqoaMFR3xQVhQHoH46GAYyGmlAzg2V0aMfGhNwIhlyAgmlkgnY0gmlwhDYm2FSJc2VjcDI1NmsxoQIs0r6Yt49IYXGZTzLKmV9NU6eDFy82CpIkGBw8sbSHvYRzbmFwwIN0Y3CCdl-DdWRwgnZ7", id: "496c218828d2d1864a9e228e7ad33a481ae60acb81becfb2e565053f4e1f1a5c", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:47924", remoteAddress: "54.38.216.84:30303", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://994252f3fbe56302ba967cab1f01fada30ef8fdb335e6f974a55dd258c2052d1c8c7f181c147d3958ca7e5c7aec76f4f316f50891b137dcbcfd811e453f9d8cc@135.125.214.37:30303?discport=30340", id: "6bcba20976d073441dfdda8631ddf8fc0db9056e00485e8fe49717dac36560df", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:44738", remoteAddress: "135.125.214.37:30303", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://29e354ff99595687d321d44b72c0e458f481046edd8d18fc5db69df0d61a44068ce9c715d74651d7c635688962f54251af861b13e5b31b4da54bb2c9f05ac794@5.9.87.183:30303?discport=30495", enr: "enr:-KO4QKwM2X_BENPlgEwVZ9SQjAMLtFF1dbJe9lmJ7eW42ai2R7ZAQ6Gc4Xzy2_BJOXsA8sESHmXeLvCGIINbAqjPxDWGAYyF1OC3g2V0aMfGhNwIhlyAgmlkgnY0gmlwhAUJV7eJc2VjcDI1NmsxoQIp41T_mVlWh9Mh1EtywORY9IEEbt2NGPxdtp3w1hpEBoRzbmFwwIN0Y3CCdl-DdWRwgncf", id: "6f1be92e4e8cb5f36e2d2e988d60d492a5992524258fab93ae146a335a8f690a", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: 
false, localAddress: "172.18.35.78:54748", remoteAddress: "5.9.87.183:30303", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://b9e2f920d31ea6cde2ad56fcd1904455d911ccf58201551c22d41c28f5a1b1d20a67c8db30893651d8a47bfe21a95705505c079892290a8cfad06f1b8c425628@44.221.198.244:30303?discport=30316", id: "7752490f98a21bde471c9151b7bfe28347cf83a0813a9fe6e66320ae63152f5b", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:41940", remoteAddress: "44.221.198.244:30303", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://9f1443433c1b1b79ccc2d95f314c4e0823d0b549d1db43e5e0a2fe3a87fdaeb2d693fa4a8e75fd6a77c2917598d91782fb75b8fc6357c4f13073653894418acb@66.70.207.63:30303?discport=30309", id: "8df6a54d5bc8fcac07f8ece1d738414190fc9fe3400776abb33471b9ead46344", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:53838", remoteAddress: "66.70.207.63:30303", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://6668bb0a2ede7963ebc196f5e2c8e4daf480a1b7510b74ad18491d733ccf32ab754b44422e4d40fb88c996a3d33fa08dc96461d77693c4a7976cadef4340ca71@148.113.163.85:30303?discport=30309", id: "8e60fc39583410b077016422c96f36ecc60f077a4910a8848917dd1e5856c4e4", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:36704", remoteAddress: "148.113.163.85:30303", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: 
"enode://298ba98e471a44af8638c297d4f25060119817d20cd49870717cfef0f92d3d3d1e3039b1b5fcd34ef66e5ef97efefb9d38e68eed20d1eec5929dfc422a3731e9@3.219.138.93:30306", id: "90871a5e7b702d78f49f829b75d44728628d6a0448d2e128dee96d3e8a39383e", name: "bor/v1.1.0/linux-amd64/go1.20.10", network: { inbound: false, localAddress: "172.18.35.78:39982", remoteAddress: "3.219.138.93:30306", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://66153dd3af7f793158934d9bd121f68e1e8c5a4c15d3316f2e222e6743f8a46fb02a3b6e70181521c0f82584ebd8b690fcf7c3056d5b78293f1bbe065f038ed9@54.235.96.140:30306", id: "93c951775b564631f98affc9e4539b91daa825e350de64a3a0b760a65d0a7826", name: "bor/v1.1.0/linux-amd64/go1.20.10", network: { inbound: false, localAddress: "172.18.35.78:37624", remoteAddress: "54.235.96.140:30306", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://697850d0a936d1d63d047ce480e6f39f429f2c33cfeec335526fb1e97aa0a11a43065bad4b0e8223ca053f91307a0a672d79586c4efdb81f531122116e6d132f@15.204.47.194:30303?discport=30340", id: "96b764ec1ca7771bdb60b464e498824b22dfc7c7cd8d8a3c28cb9ce4241d72dc", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:33882", remoteAddress: "15.204.47.194:30303", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://a34a45e54b28eef5cc58e66a932471ffa3d914af052346b423117972aa957d0816f79492e657ccf1f356713f5959274d5f39573acde4d64e00a656ae999f0a30@65.108.127.87:30303?discport=30376", id: "9ede61e13d949a6ff325274262cf677d16093daf8be60c441707c8ba047526d3", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:47926", remoteAddress: "65.108.127.87:30303", static: false, 
trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68", "snap/1"], enode: "enode://af51799ca42c94ff9db93aa933dad4d7ae5979153658df2a38f90c38654391f8a929c8d6af7cb04ea151f009a2b163d6458a71662d512adf1d300ea49107738f@5.9.87.183:30303?discport=30432", id: "a51dc5db9ffc3dbd5b5c67ed1925a486788b5e7668ca0c624b31468b4090f000", name: "bor/v1.0.6/linux-amd64/go1.20.8", network: { inbound: false, localAddress: "172.18.35.78:40188", remoteAddress: "5.9.87.183:30303", static: false, trusted: false }, protocols: { eth: { version: 68 }, snap: { version: 1 } } }, { caps: ["eth/66", "eth/67", "eth/68"], enode: "enode://76d2d6284ee5637113e3669e0fdff0fca83535e39ee0752b9338d9e306aad3f9b4db4c8e4e8738ad718c0f442daf96a37fc864d73954f931dd3c2b3d85663766@3.239.87.70:56304", id: "c0506599f03d41572ecbc8ea45b6eee0192c622eccd7d614d3bb9a3fb19e2548", name: "Geth/v1.1.8/linux-amd64/go1.20", network: { inbound: true, localAddress: "172.18.35.78:30303", remoteAddress: "3.239.87.70:56304", static: false, trusted: false }, protocols: { eth: { version: 68 } } }, { caps: ["eth/66", "eth/67", "eth/68"], enode: "enode://e6ddc59f7f585019b428a3a076a55a2ef1401926434f798b9fb29abb5502a6b33698bfba0420642132a959051f5e417af9abf6d67dc87d8e6f8e88acdbe1532b@54.90.91.58:34482", id: "d85b17d766b71531af5a5a57065ad2baef16f75df801e34ac3e446c9ea02470d", name: "Geth/v1.1.8/linux-amd64/go1.20", network: { inbound: true, localAddress: "172.18.35.78:30303", remoteAddress: "54.90.91.58:34482", static: false, trusted: false }, protocols: { eth: { version: 68 } } }]
Any idea how we can solve the issue?
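A long admin.peers dump like the one above is easier to read once summarized. This is a rough sketch that assumes the peer list is available as parsed JSON (for example from the admin_peers JSON-RPC method) with the same `name` and `network.inbound` fields shown in the dump. The function names and the three trimmed sample entries are illustrative only.

```python
from collections import Counter


def client_version(name):
    """Reduce 'bor/v1.0.6/linux-amd64/go1.20.8' to 'bor/v1.0.6'."""
    return "/".join(name.split("/")[:2])


def summarize_peers(peers):
    """Count peers by client version and by connection direction."""
    versions = Counter(client_version(peer["name"]) for peer in peers)
    inbound = sum(1 for peer in peers if peer["network"]["inbound"])
    return {
        "total": len(peers),
        "inbound": inbound,
        "outbound": len(peers) - inbound,
        "versions": dict(versions),
    }


# Three entries trimmed down from the dump above (other fields omitted).
sample_peers = [
    {"name": "bor/v1.1.0/linux-amd64/go1.20.10", "network": {"inbound": False}},
    {"name": "bor/v1.0.6/linux-amd64/go1.20.8", "network": {"inbound": False}},
    {"name": "Geth/v1.1.8/linux-amd64/go1.20", "network": {"inbound": True}},
]

print(summarize_peers(sample_peers))
```

A summary like this makes it quick to check whether the node is mostly connected to older client versions or has very few inbound peers.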
I tried to apply the [p2p.discovery] settings from https://forum.polygon.technology/t/recommended-peer-settings-for-mainnet-nodes/13018. I will let you know if this resolves the issue.
Above suggestions are not fixing the issue. Any other suggestion?
No luck. I tried a new physical machine with bor 1.1.0 and Heimdall 1.0.3 with snapshot data, and it is the same all over again: it gets stuck randomly. The original node, after weeks of manual restarts, finally ran well for half a month. I am not sure why, and I am afraid it will unexpectedly get stuck again someday.
@0xKrishna I think I might have hit the same problem on two nodes. The first node stopped importing blocks ~8 days ago, the other around 2 hours ago.
Node Stopped 2 hours ago (Stopped 2024-01-16 @ 18:30:00 EST)
I have the pprof Goroutine dump for it, see pprof.geth.goroutine.polygon-mainnet-0.pb.gz. It seems to be blocked at https://github.com/maticnetwork/bor/blob/master/core/blockchain.go#L1888.
Node Stopped 8 days ago (Stopped 2024-01-09 @ 12:00:00 EST)
I have a pprof too, see pprof.geth.goroutine.polygon-mainnet-1.pb.gz. On this one I don't clearly see what is blocked. I don't even seem to see the blockchain import goroutine there, so I am not sure what it was doing.
For this dump, I have a bor attach of admin.nodeInfo and admin.peers, see pprof-polygon-mainnet-1-attach-nodeIndo-peer.txt.
Let me know if you need more info. I'll follow the nodes more closely to see if they get stuck again so I can gather extra data points.
Extra Details
I tried to stop this node cleanly by sending a single SIGINT signal, then waited 4 hours for it to stop, but it never did. I decided to force kill it, which means that in this state the stuck node never completed the clean shutdown sequence.
Same issue on two independent nodes, stuck at random blocks with this ERROR:
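When looking for a blocked import, it can help to list which goroutines have been waiting for a long time. The sketch below assumes a plain-text dump in the format Go's net/http/pprof goroutine endpoint produces with debug=2, not the .pb.gz profiles attached above. The sample dump is invented for illustration and is not taken from those attachments.

```python
import re
from collections import Counter

# Header lines in a debug=2 dump look like:
#   goroutine 52 [semacquire, 480 minutes]:
HEADER_RE = re.compile(r"^goroutine \d+ \[([^\],]+)(?:, (\d+) minutes)?")


def long_waiters(dump_text, min_minutes=60):
    """Count goroutines per wait state that have waited at least min_minutes."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = HEADER_RE.match(line)
        if match and match.group(2) and int(match.group(2)) >= min_minutes:
            counts[match.group(1)] += 1
    return counts


# Hypothetical dump; stack frames are placeholders.
SAMPLE_DUMP = """\
goroutine 1 [chan receive, 482 minutes]:
\tplaceholder/frame.one(...)

goroutine 52 [semacquire, 480 minutes]:
\tplaceholder/frame.two(...)

goroutine 77 [select]:
\tplaceholder/frame.three(...)

goroutine 90 [IO wait, 2 minutes]:
\tplaceholder/frame.four(...)
"""

print(long_waiters(SAMPLE_DUMP))
```

Goroutines stuck in a lock-related state for roughly as long as the node has been stalled are the natural first candidates to inspect.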
heimdalld[14653]: ERROR[2024-01-20|20:45:38.152] Span proposed is not in-turn module=bor currentChildBlock=52556670 msgStartblock=52563456 msgEndBlock=52569855
Hey @eldimious @VSGic @maoueh @GeassV,
- We can ignore the "unable to handle whitelist milestone" logs. We are working on suppressing these logs to DEBUG.
- I can see your network is peered.
- Please downgrade your bor node to v1.1.0 and heimdall to v1.0.3.
- Try to restart the clients.
- If the issue persists, please attach a log dump (or copy the last 200 lines of the log) and the configuration used to start the nodes.
Thank you! 💜
Well, it got stuck at 52755409, then moved to 52756404 and got stuck again while I was trying to dump the log and config files. bor version 1.1.0 and heimdall v1.0.3. Attached are the log and config: output_24_1_26.log bor_config.txt
Hello, the same problem after the downgrade. Regular restarts are needed. Attached are the log and config: config_bor.txt out_bor.log
Hello,
I have the same issue. The bor node is stuck at block number 52962568.
bor v1.1.0
heimdall v1.0.3
I tried to restart the bor node, but it took a long time to stop.
Finally, it was killed by systemd because 'stop-sigterm' timed out.
After starting, the block number rolled back to 52921882, which is far behind the stuck block number 52962568.
Same here.
@CaCaBlocker You can ignore these logs for now as your node is not completely synced.
@RyanWang0811 Is it working now?
It is working now. thx.
Hi, I still have this problem. I restart bor 3-5 times per day.
Still have this problem, too.
This issue looks like the one I posted previously, which seems not to have been fixed or still has problems: https://github.com/maticnetwork/bor/issues/939
Is it a node bug, or an issue on the chain?
The problem is still present; two nodes with different bor versions struggle.
Hey @RyanWang0811 @VSGic, what specific errors are you currently facing? Can you share some logs?
Also, have you upgraded to bor v1.2.3?
Hello @Raneet10, I have posted logs above. I have two nodes; one runs bor v1.2.3 and it has the same problem.
I encountered this problem using the latest version on the testnet, and there is no solution yet. heimdall: v1.0.4-beta, bor: v1.2.6-beta
Hello !
Just wanted to mention that we are experiencing the same issues with our 2 Polygon bor nodes. I have set up a liveness probe (k8s) to restart the node if it gets stuck for more than 15 min. It kind of works, but it's really annoying, and we still have small interruptions when both nodes get stuck at the same moment. It happens multiple times per day. It's really bad.
Is anything planned to fix those issues?
By the way, I compared the errors in the Heimdall and bor logs from the node that was stuck on a block with the logs from the other node that was working, and I found exactly the same errors in both. So whatever causes the issue is evidently not being logged...
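For others who want a similar workaround, the stall check behind such a liveness probe can be sketched as follows. This is not the poster's actual probe. It assumes the node exposes the standard eth_blockNumber JSON-RPC method over HTTP, and the class name `StallDetector` and the 15-minute default are illustrative choices.

```python
import json
import urllib.request


def fetch_block_number(rpc_url):
    """Query the node's current head via the eth_blockNumber JSON-RPC call."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []}
    ).encode()
    request = urllib.request.Request(
        rpc_url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return int(json.load(response)["result"], 16)


class StallDetector:
    """Tracks the head height and reports when it has stopped advancing."""

    def __init__(self, max_stall_seconds=15 * 60):
        self.max_stall_seconds = max_stall_seconds
        self.last_height = None
        self.last_progress = None

    def observe(self, height, now):
        """Record a sample; return True if the head has been stuck too long."""
        if self.last_height is None or height > self.last_height:
            # First sample or real progress: reset the stall timer.
            self.last_height = height
            self.last_progress = now
            return False
        return now - self.last_progress >= self.max_stall_seconds
```

A probe loop would call `fetch_block_number` periodically, pass the result with `time.time()` to `observe`, and trigger a restart when it returns True. Keeping the stall logic separate from the RPC call makes it testable without a running node.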
Hello, we still have this trouble. We cannot send transactions through such a node: they get lost when the node is out of sync. We currently have to operate our Polygon node manually.
Hi, this is still present and has become worse: one node cannot even get synced after a reboot and gets stuck along the way again.
I also faced this issue when bootstrapping a node from the official snapshot. It seems that removing the nodekey file fixed the problem, and sync is now progressing.