grin
Failed to find one of the right cookies. Core dumped
Grin was running when my server was rebooted. After that, any attempt to start grin failed with the message above. Removing chain_data helped.
I found that the message comes from croaring: https://github.com/RoaringBitmap/CRoaring/blob/master/src/roaring_array.c#L767
We corrupted one of the pmmr_leaf.bin or pmmr_prun.bin files somehow?
We were writing to one of those when the server rebooted?
Oof.
was this before or after antioch's fix for safely writing files before stopping grin?
My fix won't help with a server reboot, only on "clean" shutdown via the grin node itself.
ok. Does core dumping risk spilling secrets to disk? Should we catch this, show a warning, then either shut down gracefully or somehow retry?
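A rough sketch of what "catch this" could look like: check the leading cookie ourselves before handing the bytes to croaring, so a bad file becomes an ordinary error we can warn about. This assumes the file holds the portable Roaring serialization with no extra prefix, which may not match what grin actually writes. The cookie values come from the Roaring format spec; `check_cookie` and `BitmapFileError` are made-up names for illustration only.

```rust
use std::fmt;

// Cookie values from the Roaring portable format spec.
const SERIAL_COOKIE_NO_RUNCONTAINER: u32 = 12346;
const SERIAL_COOKIE: u32 = 12347;

/// Hypothetical error type, not part of grin or croaring.
#[derive(Debug, PartialEq)]
pub enum BitmapFileError {
    TooShort(usize),
    BadCookie(u32),
}

impl fmt::Display for BitmapFileError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BitmapFileError::TooShort(n) => write!(f, "bitmap file too short ({} bytes)", n),
            BitmapFileError::BadCookie(c) => write!(f, "unexpected bitmap cookie {}", c),
        }
    }
}

/// Inspect the first four bytes (little endian) and reject unknown cookies.
/// A valid cookie does NOT prove the rest of the file is intact.
pub fn check_cookie(bytes: &[u8]) -> Result<(), BitmapFileError> {
    if bytes.len() < 4 {
        return Err(BitmapFileError::TooShort(bytes.len()));
    }
    let cookie = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    // With run containers, only the low 16 bits carry the cookie;
    // the high 16 bits store the container count minus one.
    if (cookie & 0xFFFF) == SERIAL_COOKIE || cookie == SERIAL_COOKIE_NO_RUNCONTAINER {
        Ok(())
    } else {
        Err(BitmapFileError::BadCookie(cookie))
    }
}
```

On an `Err`, the node could log the message and exit cleanly (or rebuild the file) instead of segfaulting inside the C library.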
Saw another report in gitter. I predict it may be a problem when we get enough nodes.
Good prediction. One issue I think is that on other storages (LMDB, MMRs), we have a way to get back to a previous snapshot (the chain head) so if a write didn't really work out we can easily find a workable checkpoint. It doesn't seem as easy with croaring but might not be too hard to add?
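To make the checkpoint idea concrete, here is a minimal sketch under assumptions of my own: each bitmap file is stored with a checksum in front of the payload, and the previous version is kept next to the current one. Nothing here is how grin works today; `seal`, `unseal` and `load_with_fallback` are hypothetical helpers, and FNV-1a is only a cheap stand-in for a real checksum.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// FNV-1a 64-bit hash, used here only as a cheap corruption check.
fn fnv1a(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in data {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Prefix the payload with its checksum (8 bytes, little endian).
pub fn seal(payload: &[u8]) -> Vec<u8> {
    let mut out = fnv1a(payload).to_le_bytes().to_vec();
    out.extend_from_slice(payload);
    out
}

/// Return the payload if the checksum matches, None if the data looks corrupted.
pub fn unseal(raw: &[u8]) -> Option<&[u8]> {
    if raw.len() < 8 {
        return None;
    }
    let (head, payload) = raw.split_at(8);
    let mut sum = [0u8; 8];
    sum.copy_from_slice(head);
    if u64::from_le_bytes(sum) == fnv1a(payload) {
        Some(payload)
    } else {
        None
    }
}

/// Try the current file first, then fall back to the previous snapshot.
pub fn load_with_fallback(current: &Path, previous: &Path) -> io::Result<Vec<u8>> {
    for candidate in [current, previous].iter() {
        if let Ok(raw) = fs::read(candidate) {
            if let Some(payload) = unseal(&raw) {
                return Ok(payload.to_vec());
            }
        }
    }
    Err(io::Error::new(io::ErrorKind::InvalidData, "no valid snapshot found"))
}
```

If the fallback snapshot is older than the chain head, the bitmap would still need to be brought forward to match, which this sketch does not attempt.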
Just so there is no confusion here - there is zero persistence in the croaring library, all of this is on us, we literally just write the bytes to a file. I believe we use a temp file to make things reasonably atomic but we have not put a lot of thought into doing this really robustly.
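For reference, the usual way to make the temp-file approach more robust is write, fsync, rename, then fsync the directory. The sketch below uses only the Rust standard library; `write_atomically` is a hypothetical helper and not grin's actual code, and the ".tmp" suffix is an arbitrary choice.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write `bytes` to `path` so that a crash leaves either the old file or the
/// new one on disk, not a half-written mix. Hypothetical helper.
pub fn write_atomically(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Keep the temp file in the same directory so the rename stays on one filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(bytes)?;
    // Flush the file contents to disk before the rename makes them visible.
    tmp.sync_all()?;
    drop(tmp);

    // On POSIX filesystems, rename replaces the target atomically.
    fs::rename(&tmp_path, path)?;

    // Sync the parent directory so the rename itself survives a power loss.
    #[cfg(unix)]
    {
        if let Some(dir) = path.parent() {
            let dir = if dir.as_os_str().is_empty() { Path::new(".") } else { dir };
            File::open(dir)?.sync_all()?;
        }
    }
    Ok(())
}
```

This only protects a single file. If pmmr_leaf.bin and pmmr_prun.bin have to stay consistent with each other and with the chain head, something on top of this is still needed.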
A memory-mapped file could help. Unfortunately it's not yet supported by croaring (but it is supported by the Java and Go versions): https://github.com/RoaringBitmap/CRoaring/issues/74
Was there ever a resolution to this? I just ran an Nvidia graphics driver update on my PC. Somewhere during the update, it crashed my VirtualBox machine, which was running my node. Now I am unable to restart my node.
Is there a workaround to get this back up and running, or should I destroy this machine, make a new one, and import the old wallet?
@Jimmy24651 sure, the workaround is in the issue text: rm -rf ~/.grin/main/chain_data (replace main with floo for floonet)
@hashmap is this an issue anymore?
@kargakis I think it's still an issue: a server could be stopped abruptly by a power outage, or the process could be killed by kill -9, etc.
Issue still exists on grin 2.0.0
I have a node installed on my mining rig because of solo mining, and after every third power outage I have to do rm -rf ~/.grin/main/chain_data and download the whole blockchain again. Very annoying bug.
Issue still exists on grin 3.0. The power went out on my computer and I received the following error when I tried to restart grin: 'I failed to find one of the right cookies. Found 3497651248 Segmentation fault (core dumped)'. Is there a fix for this? I tried 'rm -rf ~/.grin/main/chain_data' and I still get the same error.