
StartError - database corruption

Open mplorentz opened this issue 3 years ago • 6 comments

Sometimes go-ssb seems not to shut down cleanly, and when it restarts it cannot open its database. It looks like this for the user:

This is printed in the logs:

ts="2022-02-01 17:10:35.9673960 (UTC)" level=error event="bot init failed" err="BotInit: failed to make sbot instance: sbot: failed to open rootlog: failed to open log: offset2: integrity error: data file size difference -4042"

If you have experienced this error, here are the steps you can take to recover your profile.

Bugsnag link

I traced this error message to the checkJournal() function in log.go. It sounds like the underlying margaret log was not closed cleanly. What isn't clear to me is whether go-ssb is capable of recovering from this type of failure.
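
For anyone trying to read that error, here is a rough sketch in Go of the kind of consistency check that could produce it. This is not the actual margaret code: `checkSizes`, its parameters, the 8-byte length prefix, and the sign convention of the difference are all assumptions made for illustration.

```go
package main

import "fmt"

// checkSizes is a hypothetical stand-in for a journal/data consistency check.
// It is NOT the real checkJournal() from margaret.
//
//	lastOffset:    where the last indexed entry starts in the data file
//	lastEntrySize: payload length recorded for that entry
//	dataFileSize:  actual size of the data file on disk
func checkSizes(lastOffset, lastEntrySize, dataFileSize int64) error {
	const lengthPrefix = 8 // assumed size of the per-entry length header

	// The index says the data file should end right after the last entry.
	expected := lastOffset + lengthPrefix + lastEntrySize
	if diff := expected - dataFileSize; diff != 0 {
		return fmt.Errorf("integrity error: data file size difference %d", diff)
	}
	return nil
}

func main() {
	// Consistent: the data file ends exactly where the index expects.
	fmt.Println(checkSizes(1000, 200, 1208))

	// Inconsistent: the data file holds 4042 more bytes than the index
	// accounts for, e.g. if a write reached the data file but the index
	// was never updated before the process died.
	fmt.Println(checkSizes(1000, 200, 5250))
	// prints "integrity error: data file size difference -4042"
}
```

Under these assumptions, a negative difference would mean the data file is longer than the index expects, which would be consistent with an unclean shutdown partway through a write.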

mplorentz avatar Feb 01 '22 18:02 mplorentz

Sebastian experienced this issue today and I was able to get his logs and database. The error presented differently in the UI, but the underlying error from the GoBot is the same. The database is large, so I won't upload it to GitHub, but I can provide it upon request. Here are the logs, and here is the Bugsnag issue.

mplorentz avatar Feb 03 '22 21:02 mplorentz

I'm marking this as blocked by #226. If this is still an issue we are seeing on the latest version of go-ssb, then I think we should have @boreq investigate at that point.

mplorentz avatar Feb 11 '22 17:02 mplorentz

There is some new code that fixes similar database errors in go-ssb:

https://github.com/cryptoscope/ssb/blob/master/cmd/go-sbot/main.go#L264

This code also exists in the older version of go-ssb but we don't seem to be using it.

Edit: actually we seem to call similar code from Swift.

boreq avatar Feb 14 '22 13:02 boreq

I just experienced this again on an old device I installed Planetary on years ago. Several people have now seen this when launching Planetary after not using it for several months, so I'm starting to think it will be a problem for every Planetary user who fires up an identity they haven't touched for some time. My guess is that our current code is not interacting correctly with an older database format, as there is a lot of code scattered around for dealing with migrations.

This is probably something we want to prioritize before doing any marketing pushes that get old users to reopen Planetary. Even if we just give them the option to delete their database and resync from the network, that would be better than what we have now, which is an infinite loop of pressing "Start Over" or "Try Again". CC @rabble @setch-l

mplorentz avatar Feb 15 '22 21:02 mplorentz

Also, I tried calling a function I found called fsckAndRepair(), which seemed really promising but failed without a useful error message.

mplorentz avatar Feb 15 '22 21:02 mplorentz

No update on this, but we have prioritized #340 and #622 to make it easier to recover from these errors.

mplorentz avatar Jun 10 '22 18:06 mplorentz

I am closing this as scuttlego uses badger and not margaret.

boreq avatar Mar 01 '23 13:03 boreq