linux/arm64 compatibility seems broken
Hello 👋
I recently tried to run the akash binary (version v0.14.1) on the linux/arm64 architecture, and the binary crashes on startup.
Below are the steps to reproduce and my investigation.
How to reproduce?
docker run -ti --rm --platform linux/arm64 ubuntu
# In container
apt-get update && apt-get install curl wget unzip
wget https://github.com/ovrclk/akash/releases/download/v0.14.1/akash_0.14.1_linux_arm64.zip && unzip akash_0.14.1_linux_arm64.zip
curl -s "https://raw.githubusercontent.com/ovrclk/net/master/mainnet/genesis.json" > $HOME/.akash/config/genesis.json
./akash_0.14.1_linux_arm64/akash start
Note: I ran the docker command on a 2020 M1 MacBook running macOS Monterey. Docker Desktop offers multi-architecture support through --platform, which uses QEMU in the background to run architectures other than the host's (linux/arm64 instead of darwin/arm64 here).
I first hit the panic on an AWS Graviton instance, which is linux/arm64, and then reproduced it locally thanks to Docker's multi-arch support.
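Before reading the output, it can help to confirm that the container really runs as arm64. The sketch below is not part of the original report; `goarch_for` is a hypothetical helper that maps the machine name printed by `uname -m` to the Go-style architecture name used in the release asset (`linux_arm64`). On Linux, an arm64 machine reports itself as `aarch64`.

```shell
# Hypothetical helper: map a `uname -m` machine name to the Go-style
# architecture name used in release asset names such as akash_0.14.1_linux_arm64.zip.
goarch_for() {
  case "$1" in
    aarch64|arm64) echo "arm64" ;;
    x86_64|amd64)  echo "amd64" ;;
    *)             echo "unknown" ;;
  esac
}

# Report what the current environment (for example, the container) is running as.
machine="$(uname -m)"
echo "uname -m: ${machine} -> Go arch: $(goarch_for "${machine}")"
```

Run inside the container started with --platform linux/arm64, this should report arm64. If it reports amd64, the crash cannot be attributed to the arm64 build.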
Output:
7:47AM INF starting ABCI with Tendermint
7:47AM INF Starting multiAppConn service impl=multiAppConn module=proxy
7:47AM INF Starting localClient service connection=query impl=localClient module=abci-client
7:47AM INF Starting localClient service connection=snapshot impl=localClient module=abci-client
7:47AM INF Starting localClient service connection=mempool impl=localClient module=abci-client
7:47AM INF Starting localClient service connection=consensus impl=localClient module=abci-client
7:47AM INF Starting EventBus service impl=EventBus module=events
7:47AM INF Starting PubSub service impl=PubSub module=pubsub
unexpected fault address 0x5f6c61697486b4
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0x5f6c61697486b4 pc=0xb70550]
goroutine 82 [running]:
runtime.throw(0x24de26a, 0x5)
runtime/panic.go:1117 +0x54 fp=0x4000f49590 sp=0x4000f49560 pc=0x434c14
runtime.sigpanic()
runtime/signal_unix.go:741 +0x230 fp=0x4000f495d0 sp=0x4000f49590 pc=0x44c7c0
github.com/golang/snappy.encodeBlock(0x4007af8004, 0xcba544, 0xcba544, 0x40048e2000, 0x10000, 0xae8dec, 0x4000078701)
github.com/golang/[email protected]/encode_arm64.s:666 +0x360 fp=0x4000f51670 sp=0x4000f495e0 pc=0xb70550
github.com/golang/snappy.Encode(0x4007af8000, 0xcba548, 0xcba548, 0x40048f2000, 0xad8d8c, 0xad8dec, 0x4000667ce0, 0x4000667ce0, 0x4000078788)
github.com/golang/[email protected]/encode.go:39 +0x17c fp=0x4000f516c0 sp=0x4000f51670 pc=0xb6fa8c
github.com/syndtr/goleveldb/leveldb/table.(*Writer).writeBlock(0x4000f1a6c0, 0x4000f1a718, 0x2, 0x0, 0x12, 0x18, 0x4000667ce0)
github.com/syndtr/[email protected]/leveldb/table/writer.go:171 +0xb0 fp=0x4000f51740 sp=0x4000f516c0 pc=0xb78270
github.com/syndtr/goleveldb/leveldb/table.(*Writer).finishBlock(0x4000f1a6c0, 0x4003df8000, 0x12)
github.com/syndtr/[email protected]/leveldb/table/writer.go:222 +0x4c fp=0x4000f51790 sp=0x4000f51740 pc=0xb7870c
github.com/syndtr/goleveldb/leveldb/table.(*Writer).Append(0x4000f1a6c0, 0x4003df8000, 0x12, 0xaea000, 0x4003df8012, 0xae8d6c, 0xae9fee, 0x18, 0x4000f1a6c0)
github.com/syndtr/[email protected]/leveldb/table/writer.go:255 +0x1e4 fp=0x4000f51800 sp=0x4000f51790 pc=0xb78974
github.com/syndtr/goleveldb/leveldb.(*tWriter).append(0x40001a33e0, 0x4003df8000, 0x12, 0xaea000, 0x4003df8012, 0xae8d6c, 0xae9fee, 0x4000cfc700, 0x400006e800)
github.com/syndtr/[email protected]/leveldb/table.go:559 +0xbc fp=0x4000f51870 sp=0x4000f51800 pc=0xb9a4dc
github.com/syndtr/goleveldb/leveldb.(*tOps).createFrom(0x4001161290, 0x2ae2fb8, 0x4000cfc700, 0x0, 0x0, 0x0, 0x0)
github.com/syndtr/[email protected]/leveldb/table.go:397 +0x11c fp=0x4000f51910 sp=0x4000f51870 pc=0xb9985c
github.com/syndtr/goleveldb/leveldb.(*session).flushMemdb(0x4000c96ff0, 0x4000d2edc0, 0x4000168a80, 0x0, 0x0, 0x0, 0x0)
github.com/syndtr/[email protected]/leveldb/session_compaction.go:35 +0xa4 fp=0x4000f51a70 sp=0x4000f51910 pc=0xb91ad4
github.com/syndtr/goleveldb/leveldb.(*DB).memCompaction.func1(0x4000124358, 0x20d5b80, 0x40001ba101)
github.com/syndtr/[email protected]/leveldb/db_compaction.go:305 +0x64 fp=0x4000f51af0 sp=0x4000f51a70 pc=0xb9e9a4
github.com/syndtr/goleveldb/leveldb.(*compactionTransactFunc).run(0x4000d68e80, 0x4000124358, 0x0, 0xbff)
github.com/syndtr/[email protected]/leveldb/db_compaction.go:242 +0x34 fp=0x4000f51b20 sp=0x4000f51af0 pc=0xb82724
github.com/syndtr/goleveldb/leveldb.(*DB).compactionTransact(0x4000c34540, 0x24e7ca0, 0xb, 0x2a80ed8, 0x4000d68e80)
github.com/syndtr/[email protected]/leveldb/db_compaction.go:186 +0x1d0 fp=0x4000f51d50 sp=0x4000f51b20 pc=0xb82040
github.com/syndtr/goleveldb/leveldb.(*DB).compactionTransactFunc(...)
github.com/syndtr/[email protected]/leveldb/db_compaction.go:253
github.com/syndtr/goleveldb/leveldb.(*DB).memCompaction(0x4000c34540)
github.com/syndtr/[email protected]/leveldb/db_compaction.go:303 +0x324 fp=0x4000f51f10 sp=0x4000f51d50 pc=0xb82ce4
github.com/syndtr/goleveldb/leveldb.(*DB).mCompaction(0x4000c34540)
github.com/syndtr/[email protected]/leveldb/db_compaction.go:777 +0x64 fp=0x4000f51fd0 sp=0x4000f51f10 pc=0xb85e14
runtime.goexit()
runtime/asm_arm64.s:1130 +0x4 fp=0x4000f51fd0 sp=0x4000f51fd0 pc=0x46c0c4
created by github.com/syndtr/goleveldb/leveldb.openDB
github.com/syndtr/[email protected]/leveldb/db.go:156 +0x464
goroutine 1 [runnable]:
syscall.Syscall6(0x4f, 0xffffffffffffff9c, 0x400056a810, 0x4000f0a038, 0x100, 0x0, 0x0, 0xffffffffffffffff, 0x0, 0x2)
syscall/asm_linux_arm64.s:35 +0x10
syscall.Fstatat(0xffffffffffffff9c, 0x4000f72540, 0x25, 0x4000f0a038, 0x100, 0x0, 0x0)
syscall/zsyscall_linux_arm64.go:1093 +0xa8
syscall.Lstat(...)
syscall/syscall_linux_arm64.go:58
os.lstatNolog.func1(...)
os/stat_unix.go:45
os.ignoringEINTR(...)
os/file_posix.go:245
os.lstatNolog(0x4000f72540, 0x25, 0x0, 0x0, 0x0, 0x1a4)
os/stat_unix.go:44 +0x70
os.Lstat(0x4000f72540, 0x25, 0x4000699dd8, 0xb59cbc, 0x40001b2e40, 0x0)
os/stat.go:22 +0x44
os.rename(0x4000f72630, 0x27, 0x4000f72540, 0x25, 0x20, 0x1a4)
os/file_unix.go:22 +0x30
os.Rename(...)
os/file.go:348
github.com/syndtr/goleveldb/leveldb/storage.rename(...)
github.com/syndtr/[email protected]/leveldb/storage/file_storage_unix.go:63
github.com/syndtr/goleveldb/leveldb/storage.(*fileStorage).setMeta(0x40001b6380, 0x1, 0x0, 0xb97550, 0x4000699fc8)
github.com/syndtr/[email protected]/leveldb/storage/file_storage.go:267 +0x33c
github.com/syndtr/goleveldb/leveldb/storage.(*fileStorage).SetMeta(0x40001b6380, 0x1, 0x0, 0x0, 0x0)
github.com/syndtr/[email protected]/leveldb/storage/file_storage.go:292 +0xf4
github.com/syndtr/goleveldb/leveldb.(*session).newManifest(0x4000c973b0, 0x40001bc140, 0x0, 0x0, 0x0)
github.com/syndtr/[email protected]/leveldb/session_util.go:456 +0x4d0
github.com/syndtr/goleveldb/leveldb.(*session).create(...)
github.com/syndtr/[email protected]/leveldb/session.go:125
github.com/syndtr/goleveldb/leveldb.Open(0x2adfb20, 0x40001b6380, 0x0, 0x0, 0x2a5a4a8, 0x4000118030)
github.com/syndtr/[email protected]/leveldb/db.go:194 +0x1d8
github.com/syndtr/goleveldb/leveldb.OpenFile(0x40001993e0, 0x1d, 0x0, 0x40001993e0, 0x1d, 0x400114d420)
github.com/syndtr/[email protected]/leveldb/db.go:225 +0x7c
github.com/tendermint/tm-db.NewGoLevelDBWithOpts(0x24e359c, 0x8, 0x400114e1c8, 0x11, 0x0, 0x10, 0x4000092b60, 0xd)
github.com/tendermint/[email protected]/goleveldb.go:32 +0xa4
github.com/tendermint/tm-db.NewGoLevelDB(...)
github.com/tendermint/[email protected]/goleveldb.go:27
github.com/tendermint/tm-db.init.0.func1(0x24e359c, 0x8, 0x400114e1c8, 0x11, 0x4000092be8, 0x400114e101, 0x11, 0x0)
github.com/tendermint/[email protected]/goleveldb.go:15 +0x44
github.com/tendermint/tm-db.NewDB(0x24e359c, 0x8, 0x4000cc25d0, 0x9, 0x400114e1c8, 0x11, 0x4000000180, 0xa219fc, 0x400069a458, 0xa1f9ec)
github.com/tendermint/[email protected]/db.go:64 +0x2b4
github.com/tendermint/tendermint/node.DefaultDBProvider(0x4000eed0c8, 0x4000eed0c8, 0x2, 0x2, 0x19)
github.com/tendermint/[email protected]/node/node.go:69 +0xb0
github.com/tendermint/tendermint/node.createAndStartIndexerService(0x400068b180, 0x27e2fa8, 0x4000f0c230, 0x2abe390, 0x4001131080, 0x4000674b60, 0x0, 0x0, 0xa, 0x1, ...)
github.com/tendermint/[email protected]/node/node.go:259 +0x30c
github.com/tendermint/tendermint/node.NewNode(0x400068b180, 0x2aaaa30, 0x4000d2f4a0, 0x4000540720, 0x2a5be08, 0x4000eecb10, 0x40005407b0, 0x27e2fa8, 0x40005408c0, 0x2abe390, ...)
github.com/tendermint/[email protected]/node/node.go:669 +0x1d8
github.com/cosmos/cosmos-sdk/server.startInProcess(0x4000e4b720, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2accaf0, 0x4000edc350, ...)
github.com/cosmos/[email protected]/server/start.go:244 +0x3f4
github.com/cosmos/cosmos-sdk/server.StartCmd.func2(0x4000d2bb80, 0x3da6ae0, 0x0, 0x0, 0x0, 0x0)
github.com/cosmos/[email protected]/server/start.go:120 +0x144
github.com/spf13/cobra.(*Command).execute(0x4000d2bb80, 0x3da6ae0, 0x0, 0x0, 0x4000d2bb80, 0x3da6ae0)
github.com/spf13/[email protected]/command.go:850 +0x320
github.com/spf13/cobra.(*Command).ExecuteC(0x40001e62c0, 0x24ddab8, 0x5, 0x4000e7f470)
github.com/spf13/[email protected]/command.go:958 +0x268
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/[email protected]/command.go:895
github.com/spf13/cobra.(*Command).ExecuteContext(...)
github.com/spf13/[email protected]/command.go:888
github.com/ovrclk/akash/cmd/akash/cmd.Execute(0x40001e62c0, 0x2ad2348, 0x400000e420)
github.com/ovrclk/akash/cmd/akash/cmd/root.go:117 +0x230
main.main()
github.com/ovrclk/akash/cmd/akash/main.go:14 +0x24
[REDACTED]
In the exact same container on the amd64 architecture (--platform=amd64), the issue does not occur.
This is highly similar to #1206, which involved version v0.12.1.
Investigation
The issue seems to arise from the golang/snappy dependency.
After digging a bit, I stumbled upon this assembly issue on snappy. The issue is known and should be fixed by using snappy > v0.0.3.
Snappy is actually used by the Tendermint DB package (tm-db). Dependency graph:
akash v0.14.1 -> github.com/tendermint/tm-db v0.6.4 -> github.com/syndtr/goleveldb v1.0.1-0.20200815110645-5c35d600f0ca -> github.com/golang/snappy v0.0.1
Note that github.com/tendermint/tendermint also depends on github.com/tendermint/tm-db.
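The dependency chain above can be checked from a source checkout with the standard `go mod graph` and `go mod why` commands. The sketch below is an illustration only; `dependents_of` is a hypothetical helper that filters a `go mod graph` listing for edges pointing at one module, and the sample input simply restates the versions quoted in this issue.

```shell
# Hypothetical helper: print every edge of a `go mod graph` listing that points
# at the given module. Reads the graph from stdin; each input line has the form
# "<dependent> <dependency>@<version>".
dependents_of() {
  awk -v target="$1" '{ split($2, dep, "@"); if (dep[1] == target) print $1 " -> " $2 }'
}

# Sample input restating the versions quoted in the issue.
dependents_of github.com/golang/snappy <<'EOF'
github.com/tendermint/tm-db@v0.6.4 github.com/syndtr/goleveldb@v1.0.1-0.20200815110645-5c35d600f0ca
github.com/syndtr/goleveldb@v1.0.1-0.20200815110645-5c35d600f0ca github.com/golang/snappy@v0.0.1
EOF

# Intended use from the root of an akash checkout (requires the Go toolchain):
#   go mod graph | dependents_of github.com/golang/snappy
#   go mod why -m github.com/golang/snappy
```

On a real checkout, `go list -m github.com/golang/snappy` shows the version the build actually selects, which may be newer than the one a single dependency asks for.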
Conclusion
I hope the investigation is correct.
I will probably raise an issue on the github.com/tendermint/tm-db side to see whether the snappy dependency can be updated, since master is still using v0.0.1.
I am also raising it here because the akash README.md states that the binary is compatible with linux/arm64, which is not true at the moment.
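To test the hypothesis without waiting for an upstream change, the snappy requirement could be raised locally in an akash checkout and the binary rebuilt for linux/arm64. This is a sketch under assumptions, not a verified fix: `bump_snappy` is a hypothetical helper, and v0.0.4 is only assumed here to be a suitable release newer than v0.0.3.

```shell
# Hypothetical helper: raise the golang/snappy requirement of the current Go
# module and print the version that ends up selected.
# Run from the root of an akash source checkout with the Go toolchain installed.
bump_snappy() {
  version="${1:-v0.0.4}"   # assumption: a release newer than v0.0.3
  go get "github.com/golang/snappy@${version}" &&
  go mod tidy &&
  go list -m github.com/golang/snappy
}

# Usage (not executed here):
#   bump_snappy v0.0.4
```

After the bump, rebuilding the arm64 binary and rerunning the reproduction steps would show whether the segmentation fault in snappy.encodeBlock disappears.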
we encountered this with gaia and the fix was upstream:
https://github.com/cosmos/gaia/issues/862
I can't see it in your post, can you please tell us the hardware you are running on? ARM is a very large family.
> we encountered this with gaia and the fix was upstream:
Thanks for the insight! I did not know about that. That makes it relevant to have it fixed directly on the tm-db side, though.
> I can't see it in your post, can you please tell us the hardware you are running on? ARM is a very large family.
My bad. I have added the following note to the issue description:
Note: I ran the docker command on a 2020 M1 MacBook running macOS Monterey. Docker Desktop offers multi-architecture support through --platform, which uses QEMU in the background to run architectures other than the host's (linux/arm64 instead of darwin/arm64 here). I first hit the panic on an AWS Graviton instance, which is linux/arm64, and then reproduced it locally thanks to Docker's multi-arch support.
I had the same issue on an AWS t4g.large, which is a linux/arm64 Graviton instance, using v0.14.0.
Appreciate all the feedback on this, we'll have to block off some time to perform our own tests on different ARM platforms in the future.