Immediately crashes on Raspberry Pi Zero W

Open interfect opened this issue 3 years ago • 9 comments

After building on a Raspberry Pi Zero W, ./go-ssb crashes immediately on startup with one of:

t=35.351562ms starting=metrics addr=localhost:6078
go-sbot: failed to instantiate ssb server: During db.vlog.open error: Open existing file: "/home/pi/.ssb-go/sublogs/shared-badger/000001.vlog" error: while opening file: /home/pi/.ssb-go/sublogs/shared-badger/000001.vlog error: cannot allocate memory
while mmapping /home/pi/.ssb-go/sublogs/shared-badger/000001.vlog with size: 2147483646
github.com/dgraph-io/ristretto/z.OpenMmapFileUsing
        /home/pi/go/pkg/mod/github.com/dgraph-io/[email protected]/z/file.go:59
github.com/dgraph-io/ristretto/z.OpenMmapFile
        /home/pi/go/pkg/mod/github.com/dgraph-io/[email protected]/z/file.go:86
github.com/dgraph-io/badger/v3.(*logFile).open
        /home/pi/go/pkg/mod/github.com/dgraph-io/badger/[email protected]/memtable.go:556
github.com/dgraph-io/badger/v3.(*valueLog).open
        /home/pi/go/pkg/mod/github.com/dgraph-io/badger/[email protected]/value.go:581
github.com/dgraph-io/badger/v3.Open
        /home/pi/go/pkg/mod/github.com/dgraph-io/badger/[email protected]/db.go:334
go.cryptoscope.co/ssb/repo.OpenBadgerDB
        /home/pi/go-ssb/repo/multilogs.go:39
go.cryptoscope.co/ssb/sbot.New
        /home/pi/go-ssb/sbot/new.go:291
main.runSbot
        /home/pi/go-ssb/cmd/go-sbot/main.go:218
main.main
        /home/pi/go-ssb/cmd/go-sbot/main.go:393
runtime.main
        /usr/local/go/src/runtime/proc.go:255
runtime.goexit
        /usr/local/go/src/runtime/asm_arm.s:838

Or:

t=14.659818ms starting=metrics addr=localhost:6078
go-sbot: failed to instantiate ssb server: while opening memtables error: while opening fid: 3 error: while updating skiplist error: mremap size mismatch: requested: 20 got: 134217728

Those are the entire logs for the relevant ./go-ssb invocations.
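
To make the second error easier to read, here is a minimal Go sketch that pulls the two sizes out of the "mremap size mismatch" message and prints the second one in MiB. The helper name `parseMismatch` is hypothetical and only illustrates the message format shown in the logs above; it is not part of go-ssb or badger.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// mismatchRe matches the size pair in the "mremap size mismatch" message.
var mismatchRe = regexp.MustCompile(`mremap size mismatch: requested: (\d+) got: (\d+)`)

// parseMismatch (hypothetical helper) extracts the requested and actual
// sizes in bytes. ok is false when the line does not contain the message.
func parseMismatch(line string) (requested, got int64, ok bool) {
	m := mismatchRe.FindStringSubmatch(line)
	if m == nil {
		return 0, 0, false
	}
	var err error
	if requested, err = strconv.ParseInt(m[1], 10, 64); err != nil {
		return 0, 0, false
	}
	if got, err = strconv.ParseInt(m[2], 10, 64); err != nil {
		return 0, 0, false
	}
	return requested, got, true
}

func main() {
	line := "while updating skiplist error: mremap size mismatch: requested: 20 got: 134217728"
	if req, got, ok := parseMismatch(line); ok {
		// got>>20 converts bytes to whole MiB.
		fmt.Printf("requested %d bytes, got %d bytes (%d MiB)\n", req, got, got>>20)
	}
}
```

For the log line above, the second size is exactly 128 MiB, which is already a large share of the memory on a 512 MB board.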

It looks like it's running into trouble mapping some memory. While virtual memory overcommit is enabled on the system, it's possible you're trying to map an overly large block of memory and telling the virtual memory system you actually need it in RAM right now.

Of note is the fact that the Raspberry Pi Zero W is a 512 MB memory board; here's the output of free -k:

free -k
              total        used        free      shared  buff/cache   available
Mem:         440356       35108      183820         884      221428      353268
Swap:        102396       41216       61180

So I have about 350 MB memory available to run go-ssb.
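
As a rough illustration of the numbers involved, the following Go sketch compares the mapping size from the first error with the available memory reported by free -k. The constants are copied from the output above. The comparison with a 32-bit address space is an added assumption: the share of that space a single process can actually use depends on the system.

```go
package main

import "fmt"

const (
	// Size of the failed mmap, copied from the error message above.
	vlogMapBytes = 2147483646
	// "available" column of the free -k output above, in KiB.
	availableKiB = 353268
	// Total size of a 32-bit virtual address space (2^32 bytes). The part
	// usable by one process is smaller and system-dependent (assumption).
	addressSpace32 = 1 << 32
)

// toMiB converts a byte count to mebibytes.
func toMiB(bytes int64) float64 {
	return float64(bytes) / (1 << 20)
}

// mapSummary compares the requested mapping with the available memory
// and with the size of a 32-bit address space.
func mapSummary() string {
	avail := int64(availableKiB) * 1024
	return fmt.Sprintf(
		"mapping %.0f MiB, available %.0f MiB, ratio %.1fx, %.0f%% of 32-bit address space",
		toMiB(vlogMapBytes), toMiB(avail),
		float64(vlogMapBytes)/float64(avail),
		100*float64(vlogMapBytes)/float64(addressSpace32))
}

func main() {
	fmt.Println(mapSummary())
}
```

The single mapping is about 2 GiB, roughly six times the available memory and about half of everything a 32-bit process could address. Note also that 2147483646 equals 2 × (2^30 − 1); that would be consistent with a value log file size of 2^30 − 1 bytes being doubled for the mapping, but the thread does not confirm this.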

interfect avatar Oct 31 '21 02:10 interfect

Hey @interfect,

Can you try building with -tags nommio? I documented this in the known issues of the readme. Badger might be a bit too hungry in its default mode.

cryptix avatar Nov 03 '21 08:11 cryptix

I tried go build -v -tags nommio and go build -v -tags nommio ./cmd/go-sbot but those didn't seem to modify the go-sbot binary at all.

I did a go clean and go clean ./cmd/go-sbot ; go clean ./cmd/sbot-cli, but that didn't get rid of the binaries at the project root.

Then I did go build -v -tags nommio ./cmd/go-sbot which also didn't touch the binary, and go install -v -tags nommio ./cmd/go-sbot which said:

go.cryptoscope.co/ssb/repo
# go.cryptoscope.co/ssb/repo
repo/indexes.go:35:25: undefined: badgerOpts
repo/multilogs.go:38:10: undefined: badgerOpts

And didn't touch the binary.

Can you give the full command to build with that tag?

interfect avatar Nov 06 '21 16:11 interfect

This looks to be related:

https://discuss.dgraph.io/t/error-mremap-size-mismatch-on-arm64/15333

based on the "mremap size mismatch" error; please take a look at

https://github.com/dgraph-io/ristretto/pull/281

as one answer to the problem which looks like it's in BadgerDB.

vielmetti avatar Dec 09 '21 06:12 vielmetti

i've gotten this a few times on a Raspberry Pi. i fixed by deleting ./sublogs/shared-badger, but i assume that removes existing invites?

ahdinosaur avatar Jan 10 '22 06:01 ahdinosaur

I'm having this issue trying to run on an Orange Pi Zero. Deleting sublogs/shared-badger as shared by @ahdinosaur doesn't make it go away.

I'm also running this on a Raspberry Pi 4 (arm64), and there the fix works.

luandro avatar Mar 18 '22 15:03 luandro

I have a branch which fixes those build errors @interfect but the code wasn't submitted here yet as it is not fully tested. See the top 2 commits.

https://github.com/boreq/ssb/commits/no-fileio

boreq avatar Mar 22 '22 17:03 boreq

I keep getting on arm (Pi 3 & OrangePi Zero):

go-sbot: failed to instantiate ssb server: while opening memtables error: while opening fid: 1 error: while updating skiplist error: mremap size mismatch: requested: 20 got: 134217728

rm -rf /root/.ssb-go/sublogs/shared-badger only seems to work with arm64. Is your fix worth a spin for arm @boreq ?

luandro avatar Mar 24 '22 21:03 luandro

@luandro I have no idea. My fix basically makes the whole thing compile with an old build tag, which was meant for mobile devices and switched memory mapping to a different mode. To preserve the intention behind that build tag, I fixed the compilation and changed the options to reduce memory usage, even though that mode is not supported anymore: those options are now gone from badger. Therefore I think my fix is only tangentially related.

boreq avatar Mar 29 '22 14:03 boreq

We are experiencing the same issue on an RPi 3:

https://git.coopcloud.tech/PeachCloud/peach-workspace/issues/134#issuecomment-13997

mycognosist avatar Sep 08 '22 07:09 mycognosist

I hope that pulling in fixes in the badgerDB department via https://github.com/ssbc/go-ssb/pull/176 might help here. I was having a look in https://github.com/dgraph-io/badger/releases. I'm still working on #176 but it should come together shortly. This is a bit of a long shot though.

In https://github.com/planetary-social/planetary-ios/pull/533 @mplorentz mentions:

https://github.com/planetary-social/planetary-ios/issues/459 Lowers the memory usage to acceptable levels. This was done by adding some new options to go-ssb to limit the number of feeds we replicate concurrently, and by limiting the number of messages we fetch from a given feed to 1000 at a time.

And I think I've managed to dig those up on https://github.com/ssbc/go-ssb/compare/master...planetary-social:ssb:fork:

  • https://github.com/ssbc/go-ssb/commit/7f65e31e4a4c05b09f80eca6d794a1a8aa0afa9f
  • https://github.com/ssbc/go-ssb/commit/83270cf85ed3c56dc546aacd5d75b14d740fad9b
  • https://github.com/planetary-social/ssb/commit/7f471b8512c3299640bcd5376b106576f7d5003e

(edit: and potentially quite a rich search space for fixes in https://discuss.dgraph.io/c/issues/badger/37)
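
The two limits described in the quote above (a cap on concurrently replicated feeds, and fetching at most 1000 messages per request) can be sketched in Go roughly as follows. This is not the code from the linked commits: `maxConcurrentFeeds`, `fetchBatch` and `replicate` are hypothetical names, the concurrency limit of 2 is an arbitrary assumption, and the network fetch is replaced by a stub.

```go
package main

import (
	"fmt"
	"sync"
)

const (
	maxConcurrentFeeds = 2    // hypothetical value; the thread does not give the real one
	batchSize          = 1000 // messages per fetch, as stated in the quote
)

// fetchBatch is a stand-in for a network fetch (assumption). It reports how
// many messages were fetched starting at sequence from, at most limit.
func fetchBatch(total, from, limit int) int {
	if remaining := total - from; remaining < limit {
		return remaining
	}
	return limit
}

// replicate fetches every feed in batches of batchSize while keeping at most
// maxConcurrentFeeds feeds in flight. feeds maps a feed name to its message
// count; the result maps each feed name to the number of batches it needed.
func replicate(feeds map[string]int) map[string]int {
	sem := make(chan struct{}, maxConcurrentFeeds) // counting semaphore
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	batches := make(map[string]int, len(feeds))
	for name, total := range feeds {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrentFeeds feeds are running
		go func(name string, total int) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot for the next feed
			n := 0
			for seq := 0; seq < total; n++ {
				seq += fetchBatch(total, seq, batchSize)
			}
			mu.Lock()
			batches[name] = n
			mu.Unlock()
		}(name, total)
	}
	wg.Wait()
	return batches
}

func main() {
	fmt.Println(replicate(map[string]int{"alice": 2500, "bob": 1000, "carol": 0}))
}
```

Bounding both dimensions keeps the peak number of messages held in memory at roughly the concurrency limit times the batch size, which is the effect the quote describes.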

There could be more fixes lying around on this, so please let me know @boreq @mplorentz if you have time.

If you're interested to test @mycognosist @interfect @luandro @vielmetti @ahdinosaur I could patch these fixes in and add an additional -lite flag (or whatever) to go-sbot for triggering this logic? You could test & report back and we see where it goes?

decentral1se avatar Oct 25 '22 16:10 decentral1se

@decentral1se

Great work!

I could patch these fixes in and add an additional -lite flag (or whatever) to go-sbot for triggering this logic? You could test & report back and we see where it goes?

That sounds terrific. I have some travel coming up in November but I'm excited to help with testing and reporting (I'll probably be settled in a new place near the end of November). Between several testers we can probably cover Pi 0, 3 and 4.

mycognosist avatar Oct 26 '22 10:10 mycognosist

:running_woman: https://github.com/ssbc/go-ssb/pull/180 :running_woman:

decentral1se avatar Oct 27 '22 23:10 decentral1se

Quick update from my testing on Orange Pi Zero:

I have not seen any memory crashes on startup. Once it's up and running it seems stable, even after 1h30m. I have, however, seen some errors when I shut down the process (Ctrl+C):

level=warn t=12m39.797272388s event=killed msg="received signal, shutting down" signal=interrupt
level=warn t=12m39.798054875s unit=gossip event="live qry on rxlog exited"
level=error t=12m41.798690257s conn="ssb-ws :8998 listen exited" err="accept tcp [::]:8987: use of closed network connection"
level=debug t=12m41.798912212s event="sbot closing" msg="connections closed"
level=error t=12m41.799388412s event="fatal error" err="sbot: index group shutdown failed: sbot index(contacts) update of backlog failed: margaret: Entry Nulled"
t=12m41.807593941s event=panic location=checkAndLog panicLog=panics/checkAndLog1848513752 err="sbot: index group shutdown failed: sbot index(contacts) update of backlog failed: margaret: Entry Nulled"

Then on the next startup attempts:

go-sbot: failed to instantiate ssb server: while opening memtables error: while opening fid: 1 error: while updating skiplist error: mremap size mismatch: requested: 20 got: 67108864
go-sbot: failed to instantiate ssb server: During db.vlog.open error: while truncating last value log file: /home/pihole/.ssb-go/sublogs/shared-badger/000006.vlog error: mremap size mismatch: requested: 20 got: 67108864
go-sbot: failed to instantiate ssb server: while opening memtables error: while opening fid: 2 error: while updating skiplist error: mremap size mismatch: requested: 20 got: 67108864

Deleting the shared-badger sublogs directory solves the problem.

mycognosist avatar Oct 31 '22 12:10 mycognosist

The hope is that https://github.com/ssbc/go-ssb/pull/180 resolved this. However, we're seeing that there are still some issues on 32-bit systems. This has been documented in https://github.com/ssbc/go-ssb/blob/master/docs/faq.md#what-platforms-does-go-ssb-support. Testing around this issue and the crashes is happening in https://github.com/ssbc/go-ssb/issues/183, so I suggest we continue there.

decentral1se avatar Nov 07 '22 17:11 decentral1se