hornet
hornet copied to clipboard
after a time, CMI is progressively lagging, until hornet is restarted
Describe the bug
CMI started lagging after about 42h from hornet node starting up. after another 24h CMI is ~15,000 MS behind after restarting hornet the node is back in sync in under 3m
To Reproduce Steps to reproduce the behavior:
- start hornet / sync db
- after a while CMI starts lagging behind LMI
Expected behavior node to remain in sync
Environment information:
- OS: [Ubuntu 22.04]
- RAM: [8 GB]
- Cores: [4 vCores]
- Type: [VM]
- HORNET version: [e3fe9e70 (HEAD -> develop, origin/develop, origin/HEAD) - 2.0.0-beta.2]
Additional context
node is started and synced from index 58400:
2022-07-21T13:16:40-07:00 INFO Snapshot creating delta snapshot for targetIndex 60064
2022-07-21T13:16:40-07:00 INFO WarpSync Synchronized 1714 milestones in 40.811s (2398.83 BPS)
CMI/LMI start drifting here:
2022-07-23T07:56:47-07:00 INFO Tangle Valid milestone detected! Index: 90836
req(qu/pe/proc/lat): 00000/00000/00000/0000ms, reqQMs: 0, processor: 00000, CMI/LMI: 90835/90836, BPS (in/new/out): 00029/00009/00018, Tips (non-/semi-lazy): 57/0
after ~24h CMI is > 15,000 MS behind:
2022-07-24T07:56:47-07:00 INFO Tangle Valid milestone detected! Index: 108116
2022-07-24T07:56:47-07:00 INFO WarpSync Checkpoint updated to milestone 108116 (target 108116)
2022-07-24T07:56:47-07:00 INFO WarpSync Target updated to milestone 108116 (checkpoint 108116)
req(qu/pe/proc/lat): 00000/00016/00000/0000ms, reqQMs: 0, processor: 00000, CMI/LMI: 93021/108116, BPS (in/new/out): 00048/00000/00000, Tips (non-/semi-lazy): 26/0
node restarted and re-synced:
2022-07-24T09:57:15-07:00 INFO Tangle Run solidity check for Milestone (109561)...
2022-07-24T09:57:15-07:00 INFO Tangle Solidifier finished: blocks: 0, collect: 0s, solidity 0s, propagation: 0s, total: 0s
2022-07-24T09:57:15-07:00 INFO Snapshot creating delta snapshot for targetIndex 109511
2022-07-24T09:57:15-07:00 INFO WarpSync Synchronized 16383 milestones in 2m21.186s (6576.78 BPS)
same node and issue:
node was restarted and synced - CMI/LMI: 126250/126250
2022-07-25T09:08:00-07:00 INFO WarpSync Synchronized 2580 milestones in 14.191s (10290.63 BPS)
CMI started falling behind - 2022-07-26T06:06:47-07:00:
2022-07-26T06:06:47-07:00 INFO Tangle Valid milestone detected! Index: 141356
req(qu/pe/proc/lat): 00000/00000/00000/0000ms, reqQMs: 0, processor: 00000, CMI/LMI: 141355/141356, BPS (in/new/out): 00027/00009/00018, Tips (non-/semi-lazy): 53/0
after about 3h, CMI was almost 2,000 MS behind
2022-07-26T09:06:07-07:00 INFO Tangle Valid milestone detected! Index: 143508
2022-07-26T09:06:07-07:00 INFO WarpSync Checkpoint updated to milestone 143508 (target 143508)
2022-07-26T09:06:07-07:00 INFO WarpSync Target updated to milestone 143508 (checkpoint 143508)
req(qu/pe/proc/lat): 00000/00018/00000/0000ms, reqQMs: 0, processor: 00000, CMI/LMI: 141585/143508, BPS (in/new/out): 00049/00010/00000, Tips (non-/semi-lazy): 20/0
pprof and fresh log attached
pprof.hornet.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz shimmer-1-jul-26.zip full goroutine dump.zip memory_profile_shimmer_beta-2.pdf
On the first view it looks like the hard drive is slow. What hardware are you using?
SSD. i think the problem was running the inx-chronicle plugin. Stopped inx-chronicle the node synced up right away
@luca-yvr We had the same problem on the Testnet
but there were several major improvements in Chronicle since then. could you please retry with chronicle 1.0.0-beta.8
?
Seems the issue was not on HORNET side. Closing this for now.