murmur3 icon indicating copy to clipboard operation
murmur3 copied to clipboard

Flaky "unsafe pointer conversion"-panics with -race and go1.14rc1

Open imsodin opened this issue 4 years ago • 15 comments

I cannot reproduce the panic below with a minimal example and it occurs very often, but neither all the time nor in the same test - so this report might be entirely unhelpful, but I am reporting it anyway in case it makes any sense to you:

When running the Syncthing database testsuite on go1.14rc1 with race detection enabled on linux/amd64 I get the following panic:

$ go test -v -race -short -count 1 github.com/syncthing/syncthing/lib/db/ 2>&1 | tee test.out
fatal error: checkptr: unsafe pointer conversion

goroutine 148 [running]:
runtime.throw(0xd106c0, 0x23)
	/media/ext4_data/Linux/source/go/src/runtime/panic.go:1112 +0x72 fp=0xc00022ba50 sp=0xc00022ba20 pc=0x4630d2
runtime.checkptrAlignment(0xc00011e3f1, 0xc40880, 0x1)
	/media/ext4_data/Linux/source/go/src/runtime/checkptr.go:13 +0xd0 fp=0xc00022ba80 sp=0xc00022ba50 pc=0x4348a0
github.com/spaolacci/murmur3.(*digest128).bmix(0xc000123c80, 0xc00011e3f1, 0x20, 0x2f, 0x0, 0x48, 0x58)
	/media/ext4_data/Coding/go/pkg/mod/github.com/spaolacci/[email protected]/murmur128.go:67 +0xc0 fp=0xc00022bae8 sp=0xc00022ba80 pc=0xb649f0
github.com/spaolacci/murmur3.(*digest).Write(0xc000123c80, 0xc00011e3f1, 0x20, 0x2f, 0xb64474, 0xc000123c80, 0xc000123c80)
	/media/ext4_data/Coding/go/pkg/mod/github.com/spaolacci/[email protected]/murmur.go:51 +0x127 fp=0xc00022bb98 sp=0xc00022bae8 pc=0xb63dd7
github.com/spaolacci/murmur3.(*digest128).Write(0xc000123c80, 0xc00011e3f1, 0x20, 0x2f, 0x10, 0x10, 0x21)
	<autogenerated>:1 +0x6a fp=0xc00022bbf8 sp=0xc00022bb98 pc=0xb6558a
github.com/willf/bloom.baseHashes(0xc00011e3f1, 0x20, 0x2f, 0x0, 0x0, 0x0, 0x0)
	/media/ext4_data/Coding/go/pkg/mod/github.com/willf/[email protected]+incompatible/bloom.go:97 +0xb9 fp=0xc00022bc68 sp=0xc00022bbf8 pc=0xb6d179
github.com/willf/bloom.(*BloomFilter).Test(0xc000134700, 0xc00011e3f1, 0x20, 0x2f, 0xdf6fe0)
	/media/ext4_data/Coding/go/pkg/mod/github.com/willf/[email protected]+incompatible/bloom.go:183 +0x6b fp=0xc00022bd20 sp=0xc00022bc68 pc=0xb6deab
github.com/syncthing/syncthing/lib/db.(*Lowlevel).gcBlocks(0xc000098870, 0x0, 0x0)
	/media/ext4_data/Coding/go/src/github.com/syncthing/syncthing/lib/db/lowlevel.go:570 +0x4d4 fp=0xc00022beb8 sp=0xc00022bd20 pc=0xb798b4
github.com/syncthing/syncthing/lib/db.(*Lowlevel).gcRunner(0xc000098870)
	/media/ext4_data/Coding/go/src/github.com/syncthing/syncthing/lib/db/lowlevel.go:483 +0x279 fp=0xc00022bfd8 sp=0xc00022beb8 pc=0xb78fb9
runtime.goexit()
	/media/ext4_data/Linux/source/go/src/runtime/asm_amd64.s:1375 +0x1 fp=0xc00022bfe0 sp=0xc00022bfd8 pc=0x4969c1
created by github.com/syncthing/syncthing/lib/db.NewLowlevel
	/media/ext4_data/Coding/go/src/github.com/syncthing/syncthing/lib/db/lowlevel.go:65 +0x36a

(rest of the backtrace does not include murmur3 nor bloom).

Do you have any ideas what this might be about? Thanks in advance for any help.

imsodin avatar Feb 10 '20 21:02 imsodin

I suspect the problem is that there is no guarantee that the alignment is correct after the pointer conversion here:

https://github.com/spaolacci/murmur3/blob/539464a789e9b9f01bc857458ffe2c5c1a2ed382/murmur128.go#L68

calmh avatar Feb 26 '20 15:02 calmh

This test reproduces it reliably:

func TestUnalignedWrite(t *testing.T) {
	b := make([]byte, 128)
	for i := 0; i < 16; i++ {
		Sum128(b[i:])
	}
}

(Also one of the existing string tests)

calmh avatar Feb 26 '20 15:02 calmh

Sigh. go1.14 is officially out (yay), and checkptr validation is now a real thing :D

~@calmh thank you for taking the time to work on this, I stole your test to try and solve this the way go-cmp did, with #31 being the result.~

~Benchmark results seem a bit better than using encoding/binary but not sure if it's enough to worth keeping unsafe around there.~

Sigh. Sorry it's getting late, that was completely wrong, let me try again. :D

Converting a Pointer to a uintptr produces the memory address of the value pointed at, as an integer. The usual use for such a uintptr is to print it.

Conversion of a uintptr back to Pointer is not valid in general.

A uintptr is an integer, not a reference. Converting a Pointer to a uintptr creates an integer value with no pointer semantics. Even if a uintptr holds the address of some object, the garbage collector will not update that uintptr’s value if the object moves, nor will that uintptr keep the object from being reclaimed.

If p points into an allocated object, it can be advanced through the object by conversion to uintptr, addition of an offset, and conversion back to Pointer.

I was going for something like this but checkptr still doesn't like it.

        nblocks := len(p) / 4
        for i := 0; i < nblocks; i++ {
-               k1 := *(*uint32)(unsafe.Pointer(&p[i*4]))
+               k1 := *(*uint32)(unsafe.Pointer(uintptr(unsafe.Pointer(&p[0])) + uintptr(i*4)*unsafe.Sizeof(&p[0])))

geoah avatar Feb 26 '20 22:02 geoah

The check enforces this:

When converting unsafe.Pointer to *T, the resulting pointer must be aligned appropriately for T.

Any given byte is not going to be properly aligned for uint32 or uint64 so we can’t assume that we can do that conversion unfortunately.

calmh avatar Feb 27 '20 05:02 calmh

Fundamentally, the unsafe code needs changing. Feel free to take my patches: https://github.com/twmb/murmur3/compare/b0c3ada...HEAD

Note that using binary.LittleEndian (or my path) will resolve this repos tickets around issues across architectures, as well.

twmb avatar Feb 28 '20 04:02 twmb

That patch looks better than mine, which was just a quick fix. We might end up depending on your repo instead, @twmb. :)

calmh avatar Feb 28 '20 06:02 calmh

not merged. So, I've switched my code to github.com/DataDog/mmh3

mangup avatar Apr 01 '20 18:04 mangup

The fork of @twmb seems quite nice: https://github.com/twmb/murmur3

oliverpool avatar Apr 01 '20 18:04 oliverpool

FWIW DataDog/mmh3 theoretically same unaligned input problem, but I think it goes through unsafe & reflect.SliceHeader in a roundabout enough way that the new checks don't catch it:

        h := *(*reflect.SliceHeader)(unsafe.Pointer(&key))
        h.Len = nblocks * 2
        b := *(*[]uint64)(unsafe.Pointer(&h))

twmb avatar Apr 02 '20 20:04 twmb

seems commit 229247d33b has already solved this problem, maybe we can close this issue?

kumakichi avatar Jul 02 '20 14:07 kumakichi

Likely, but for what it's worth only the warning was removed, the underlying issue the warning was about still exists. Given that nobody complained about the issue in years, it may not a large issue.

"The main benefit of the check is to catch people making alignment assumptions on x86, where their program would then fail on another architecture. That just doesn't seem that much of a pain point to me. Even if it fails on another architecture, at least it does so loudly, without silently corrupting data"

twmb avatar Jul 02 '20 15:07 twmb

I am facing similar issue when migrating to go1.14.x

fatal error: checkptr: unsafe pointer arithmetic

goroutine 26 [running]: runtime.throw(0xecd19c, 0x23) /usr/local/go/src/runtime/panic.go:1116 +0x72 fp=0xc000054d68 sp=0xc000054d38 pc=0x467402 runtime.checkptrArithmetic(0xc000054e38, 0x0, 0x0, 0x0) /usr/local/go/src/runtime/checkptr.go:43 +0xb5 fp=0xc000054d98 sp=0xc000054d68 pc=0x438a95 github.com/spaolacci/murmur3.Sum32WithSeed(0xc000054e38, 0x1b, 0x20, 0xc000000000, 0x1b) /go/pkg/mod/github.com/spaolacci/[email protected]/murmur32.go:129 +0x8a fp=0xc000054e00 sp=0xc000054d98 pc=0xcf070a github.com/spaolacci/murmur3.Sum32(...) /go/pkg/mod/github.com/spaolacci/[email protected]/murmur32.go:111

jgheewala avatar Jul 27 '20 16:07 jgheewala

This is still broken in go1.15.5 darwin/amd64 for me. In particular, murmur32 is tripping checkptr even when the pointer is actually aligned. Seems like a bug in Go to me.

dragonsinth avatar Dec 23 '20 18:12 dragonsinth

I am facing the same issue

fatal error: checkptr: pointer arithmetic result points to invalid allocation

goroutine 46 [running]:
runtime.throw({0x1036bb2d3?, 0x3cb?})
	/Users/sheregeda/.gvm/gos/go1.19.8/src/runtime/panic.go:1047 +0x40 fp=0xc0002ad0f0 sp=0xc0002ad0c0 pc=0x102e03100
runtime.checkptrArithmetic(0x50?, {0x0, 0x0, 0xc0007da358?})
	/Users/sheregeda/.gvm/gos/go1.19.8/src/runtime/checkptr.go:69 +0xac fp=0xc0002ad120 sp=0xc0002ad0f0 pc=0x102dd063c
github.com/spaolacci/murmur3.Sum32WithSeed({0xc0002ad248, 0xd, 0x20}, 0x0)
	/Users/sheregeda/.gvm/pkgsets/go1.19.8/tabby-notify/pkg/mod/github.com/spaolacci/[email protected]/murmur32.go:129 +0x7c fp=0xc0002ad190 sp=0xc0002ad120 pc=0x1035f581c
github.com/spaolacci/murmur3.Sum32(...)
	/Users/sheregeda/.gvm/pkgsets/go1.19.8/tabby-notify/pkg/mod/github.com/spaolacci/[email protected]/murmur32.go:111
gitlab.com/tabby.ai/notify/internal/services/sms.generateSeedByPhone(...)
	/Users/sheregeda/work/tabby/notify/internal/services/sms/service.go:312
gitlab.com/tabby.ai/notify/internal/services/sms.shuffleRouting({0xc0003fc2d0, 0x3, 0x10368dca8?}, {0xc00011b830, 0x3, 0x102dd06d0?}, {0xc0007c9953, 0xd})
	/Users/sheregeda/work/tabby/notify/internal/services/sms/service.go:289 +0x100 fp=0xc0002ad350 sp=0xc0002ad190 pc=0x10365b860
gitlab.com/tabby.ai/notify/internal/services/sms.(*Service).getProviders(0xc000595bd0, {0x103a6b2f0, 0xc0001a6018}, {0x102e47940?}, 0x0)
	/Users/sheregeda/work/tabby/notify/internal/services/sms/service.go:205 +0x74c fp=0xc0002ad690 sp=0xc0002ad350 pc=0x10365ac2c
gitlab.com/tabby.ai/notify/internal/services/sms.(*Service).SendSMS(0xc000595bd0, {0x103a6b2f0, 0xc0001a6018}, {0x0?}, {0x10368c0af, 0x9}, 0x103f63960?)
	/Users/sheregeda/work/tabby/notify/internal/services/sms/service.go:121 +0x280 fp=0xc0002ad900 sp=0xc0002ad690 pc=0x103659c30
gitlab.com/tabby.ai/notify/internal/services/sms.(*ServiceTestSuite).Test_SendSMS.func2()
	/Users/sheregeda/work/tabby/notify/internal/services/sms/service_test.go:624 +0x58c fp=0xc0002ade50 sp=0xc0002ad900 pc=0x10366217c
github.com/stretchr/testify/suite.(*Suite).Run.func2(0x0?)
	/Users/sheregeda/.gvm/pkgsets/go1.19.8/tabby-notify/pkg/mod/github.com/stretchr/[email protected]/suite/suite.go:112 +0x4c fp=0xc0002ade90 sp=0xc0002ade50 pc=0x103642fec
testing.tRunner(0xc0004df040, 0xc0001bc780)
	/Users/sheregeda/.gvm/gos/go1.19.8/src/testing/testing.go:1446 +0x18c fp=0xc0002adfa0 sp=0xc0002ade90 pc=0x102f1c18c
testing.(*T).Run.func1()
	/Users/sheregeda/.gvm/gos/go1.19.8/src/testing/testing.go:1493 +0x44 fp=0xc0002adfd0 sp=0xc0002adfa0 pc=0x102f1d2f4
runtime.goexit()
	/Users/sheregeda/.gvm/gos/go1.19.8/src/runtime/asm_arm64.s:1172 +0x4 fp=0xc0002adfd0 sp=0xc0002adfd0 pc=0x102e38bf4
created by testing.(*T).Run
	/Users/sheregeda/.gvm/gos/go1.19.8/src/testing/testing.go:1493 +0x560

sheregeda avatar Nov 22 '23 14:11 sheregeda

@sheregeda did you read the thread? Especially https://github.com/spaolacci/murmur3/issues/29#issuecomment-607429181

The fork of @twmb seems quite nice: https://github.com/twmb/murmur3

oliverpool avatar Nov 22 '23 15:11 oliverpool