
Node will freeze sporadically after a few days to a week of running

Open jaw709 opened this issue 2 years ago • 3 comments

Description:

After a successful installation and sync of the grin node, the client runs without incident for a few days to a week and then begins to "freeze". This has happened consistently across multiple fresh installs.

Install guide:

https://github.com/mimblewimble/grin/blob/master/doc/build.md

hardware:

Raspberry Pi 4B, 4GB RAM. Node installed on an SD card, separate from the OS, which is on USB.

To Reproduce Steps to reproduce the behavior:

  1. Run ./grin
  2. Expect: after a few days to a week, the terminal is found hung
  3. See error: Logs attached

Relevant Information Replacing the "Main" folder with a backup taken after the first sync resolves the issue for about a week.


Desktop (please complete the following information):

  • OS: Ubuntu 22.04
  • Version: 5.2.0-alpha.1

Additional context I have tried running the node on multiple storage media and file systems: data on USB versus SD card, with FAT32, ext4, btrfs, etc. The same freeze occurs consistently. While observing the running node, I have not seen any RAM or CPU resource problems. After replacing the main folder, or sometimes after just restarting, it resumes as normal. Thank you.

grin-server-backup625.log

jaw709 avatar Jun 26 '22 03:06 jaw709

I believe I'm getting closer to isolating the pertinent error. I realized that the logs print in the time zone set by the OS, while the node displays UTC. When I searched the logs for the correct timestamp, it seems the freeze could be related to hashset compaction. Please see attached. PXL_20220709_044929021 MP PXL_20220709_044138984 MP grin-server.log.2.gz grin-server-logs-all.zip

jaw709 avatar Jul 09 '22 05:07 jaw709

Just posting the latest update... The node froze again within one minute of tx hashset compaction. NOTE: the node displays UTC, which is four hours ahead of the logs; the logs print in EST.
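For anyone trying to line up the node's UTC display with the local-time log entries, here is a minimal Python sketch of the conversion. It is not taken from the grin codebase. It assumes log lines begin with a timestamp shaped like `20220714 14:18:40.956` (an assumption about the log format, adjust `fmt` and `stamp_len` if yours differs) and that the log clock is four hours behind UTC, as noted above. The helper names `log_time_to_utc` and `lines_near` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the log's local clock is 4 hours behind UTC (see the note above).
LOCAL_TZ = timezone(timedelta(hours=-4))


def log_time_to_utc(stamp, fmt="%Y%m%d %H:%M:%S.%f"):
    """Parse a local-time log timestamp and return it as an aware UTC datetime."""
    local = datetime.strptime(stamp, fmt).replace(tzinfo=LOCAL_TZ)
    return local.astimezone(timezone.utc)


def lines_near(lines, target_utc, window_s=60.0, stamp_len=21):
    """Return log lines stamped within window_s seconds of target_utc.

    Lines whose first stamp_len characters do not parse as a timestamp
    (continuation lines, blank lines) are skipped.
    """
    hits = []
    for line in lines:
        try:
            when = log_time_to_utc(line[:stamp_len])
        except ValueError:
            continue
        if abs((when - target_utc).total_seconds()) <= window_s:
            hits.append(line)
    return hits


if __name__ == "__main__":
    # 14:18 on the log's local clock corresponds to 18:18 UTC.
    print(log_time_to_utc("20220714 14:18:40.956"))
```

Passing the UTC time at which the node appeared to freeze as `target_utc` narrows the log down to the entries around that moment, which makes it easier to check whether a compaction message falls inside the window.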

grin-server-714.log

PXL_20220714_181840956

jaw709 avatar Jul 14 '22 18:07 jaw709

Testing on the Mainnet PIBD_impl branch initially did seem to work; however, the freezing returned, within one minute before compaction began. Logs attached: aug22-SSandLogs.zip

jaw709 avatar Sep 22 '22 15:09 jaw709