sonic
sonic copied to clipboard
Panic: invalid pc-encoded table
Hello guys. We have a problem in our service and sonic appeared in trace. Unfortunately it happens only in production environment so I don't know how to reproduce this behavior. Go version in our project - 1.18 Sonic version - 1.5.0 Can you help us to analyze the problem? If you need more information - just say it. If it's not because of sonic than sorry. You project is really amazing and we love it! ❤️
Stack trace:
runtime: invalid pc-encoded table f=google.golang.org/protobuf/internal/impl.(*MessageInfo).checkInitialized-fm pc=0x7d78cf targetpc=0x7d78d3 tab=[0/0]0x0
value=0 until pc=0x7d786a
value=32 until pc=0x7d78a2
value=0 until pc=0x7d78be
value=32 until pc=0x7d78cf
fatal error: invalid runtime symbol table
goroutine 0 [idle]:
runtime: unexpected return pc for runtime.sigreturn called from 0x7
stack: frame={sp:0xc00010b380, fp:0xc00010b388} stack=[0xc000104000,0xc00010c000)
0x000000c00010b280: 0x000000c00010b2f8 0x000000000044dd05 <runtime.sigtrampgo+0x00000000000001a5>
0x000000c00010b290: 0x000000c00000001b 0x000000c000100000
0x000000c00010b2a0: 0x000000c00010b2b8 0x000000c156c09ba0
0x000000c00010b2b0: 0x0000000000000000 0x0000000000000000
0x000000c00010b2c0: 0x0000000000000000 0x0000000000000000
0x000000c00010b2d0: 0x0000000000000000 0x0000000000000000
0x000000c00010b2e0: 0x000000c04a7b5a00 0x000000c00010b4b0
0x000000c00010b2f0: 0x000000c00010b380 0x000000c00010b320
0x000000c00010b300: 0x000000000046e7ee <runtime.sigtrampgo+0x000000000000002e> 0x000000000000001b
0x000000c00010b310: 0x000000c00010b4b0 0x000000c00010b380
0x000000c00010b320: 0x000000c00010b370 0x000000000046dcfd <runtime.sigtramp+0x000000000000003d>
0x000000c00010b330: 0x000000000000001b 0x000000c00010b4b0
0x000000c00010b340: 0x000000c00010b380 0x000000c1d13948db
0x000000c00010b350: 0x0000000000000000 0x0000000000000000
0x000000c00010b360: 0x00000000000000ef 0x000000c00213b6d8
0x000000c00010b370: 0x000000c00213b658 0x000000000046de00 <runtime.sigreturn+0x0000000000000000>
0x000000c00010b380: <0x0000000000000007 >0x0000000000000000
0x000000c00010b390: 0x000000c000104000 0x0000000000000000
0x000000c00010b3a0: 0x0000000000008000 0x00000000011a7150 <github.com/bytedance/sonic/internal/native/avx2.__native_entry__+0x000000000000e290>
0x000000c00010b3b0: 0x00000000011a9150 <github.com/bytedance/sonic/internal/native/avx2.__native_entry__+0x0000000000010290> 0x00000000000ff77d
0x000000c00010b3c0: 0x000000c00213b6e8 0x0000000000000000
0x000000c00010b3d0: 0x0000000000000000 0x00000000000000ef
0x000000c00010b3e0: 0x000000c00213b6d8 0x000000c169519d5b
0x000000c00010b3f0: 0x00000000000000a4 0x000000c00213b658
0x000000c00010b400: 0x000000c1d13948db 0x000000c1d1394883
0x000000c00010b410: 0x0000000000000076 0x00000000000000a4
0x000000c00010b420: 0x000000c00213b618 0x000000000119afdf <github.com/bytedance/sonic/internal/native/avx2.__native_entry__+0x000000000000211f>
0x000000c00010b430: 0x0000000000000206 0x002b000000000033
0x000000c00010b440: 0x0000000000000000 0x0000000000000000
0x000000c00010b450: 0x0000000000000000 0x0000000000000000
0x000000c00010b460: 0x000000c00010b540 0x0000000000000000
0x000000c00010b470: 0x0000000000000000 0x0000000000000000
0x000000c00010b480: 0x0000000000000000
runtime.throw({0x1b16e2e?, 0x0?})
/usr/local/go/src/runtime/panic.go:992 +0x71
runtime.pcvalue({0x283a320?, 0x2ba9320?}, 0x9da15, 0x7d78d3, 0xc00010ad68, 0x1)
/usr/local/go/src/runtime/symtab.go:963 +0x57a
runtime.funcspdelta({0x283a320?, 0x2ba9320?}, 0x0?, 0x0?)
/usr/local/go/src/runtime/symtab.go:1038 +0x34
runtime.gentraceback(0x0?, 0xc00010afd0?, 0x3a?, 0x40?, 0x0, 0xc00010afd0, 0x40, 0x0, 0x0?, 0x6)
/usr/local/go/src/runtime/traceback.go:191 +0x67c
runtime.sigprof(0x119afdf, 0xc00010b270?, 0x0?, 0xc04a7b5a00, 0xc000100000)
/usr/local/go/src/runtime/proc.go:4507 +0x108
runtime.sighandler(0x1b?, 0xc000100000?, 0xc00010b2b8?, 0xc156c09ba0?)
/usr/local/go/src/runtime/signal_unix.go:613 +0x5e6
runtime.sigtrampgo(0x1b, 0xc00010b4b0, 0xc00010b380)
/usr/local/go/src/runtime/signal_unix.go:477 +0x1a5
runtime.sigtrampgo(0x1b, 0xc00010b4b0, 0xc00010b380)
<autogenerated>:1 +0x2e
runtime.sigtramp()
/usr/local/go/src/runtime/sys_linux_amd64.s:361 +0x3d
runtime: unexpected return pc for runtime.sigreturn called from 0x7
stack: frame={sp:0xc00010b380, fp:0xc00010b388} stack=[0xc000104000,0xc00010c000)
0x000000c00010b280: 0x000000c00010b2f8 0x000000000044dd05 <runtime.sigtrampgo+0x00000000000001a5>
0x000000c00010b290: 0x000000c00000001b 0x000000c000100000
0x000000c00010b2a0: 0x000000c00010b2b8 0x000000c156c09ba0
0x000000c00010b2b0: 0x0000000000000000 0x0000000000000000
0x000000c00010b2c0: 0x0000000000000000 0x0000000000000000
0x000000c00010b2d0: 0x0000000000000000 0x0000000000000000
0x000000c00010b2e0: 0x000000c04a7b5a00 0x000000c00010b4b0
0x000000c00010b2f0: 0x000000c00010b380 0x000000c00010b320
0x000000c00010b300: 0x000000000046e7ee <runtime.sigtrampgo+0x000000000000002e> 0x000000000000001b
0x000000c00010b310: 0x000000c00010b4b0 0x000000c00010b380
0x000000c00010b320: 0x000000c00010b370 0x000000000046dcfd <runtime.sigtramp+0x000000000000003d>
0x000000c00010b330: 0x000000000000001b 0x000000c00010b4b0
0x000000c00010b340: 0x000000c00010b380 0x000000c1d13948db
0x000000c00010b350: 0x0000000000000000 0x0000000000000000
0x000000c00010b360: 0x00000000000000ef 0x000000c00213b6d8
0x000000c00010b370: 0x000000c00213b658 0x000000000046de00 <runtime.sigreturn+0x0000000000000000>
0x000000c00010b380: <0x0000000000000007 >0x0000000000000000
0x000000c00010b390: 0x000000c000104000 0x0000000000000000
0x000000c00010b3a0: 0x0000000000008000 0x00000000011a7150 <github.com/bytedance/sonic/internal/native/avx2.__native_entry__+0x000000000000e290>
0x000000c00010b3b0: 0x00000000011a9150 <github.com/bytedance/sonic/internal/native/avx2.__native_entry__+0x0000000000010290> 0x00000000000ff77d
0x000000c00010b3c0: 0x000000c00213b6e8 0x0000000000000000
0x000000c00010b3d0: 0x0000000000000000 0x00000000000000ef
0x000000c00010b3e0: 0x000000c00213b6d8 0x000000c169519d5b
0x000000c00010b3f0: 0x00000000000000a4 0x000000c00213b658
0x000000c00010b400: 0x000000c1d13948db 0x000000c1d1394883
0x000000c00010b410: 0x0000000000000076 0x00000000000000a4
0x000000c00010b420: 0x000000c00213b618 0x000000000119afdf <github.com/bytedance/sonic/internal/native/avx2.__native_entry__+0x000000000000211f>
0x000000c00010b430: 0x0000000000000206 0x002b000000000033
0x000000c00010b440: 0x0000000000000000 0x0000000000000000
0x000000c00010b450: 0x0000000000000000 0x0000000000000000
0x000000c00010b460: 0x000000c00010b540 0x0000000000000000
0x000000c00010b470: 0x0000000000000000 0x0000000000000000
0x000000c00010b480: 0x0000000000000000
runtime.sigreturn()
/usr/local/go/src/runtime/sys_linux_amd64.s:466
Can you give me some usage codes? It seems being introduced by a corrupt memory pointer -- Is there any unsafe pointer in your decoded|encoded object?
Sonic has widespread usage in our project. So it may be a little bit hard. But let's try.
First , unsafe in our project:
Only this one
// BytesToString ...
func BytesToString(item []byte) string {
return *(*string)(unsafe.Pointer(&item))
}
About sonic usage, we are using it for marshaling big proto structures and also map[string]interface{} (some values are proto structures)
Actually BytesToString
is quite an unsafe behavior, maybe the item
byte-slice is used somewhere concurrently, or has got freed by GC. You should make sure it is fully secure for sonic once you passed it
Hello again 👋🏻 It still panics :(
So we dropped unsafe code from our project. The only unsafe that exists is unsafe in protobuf internals. Also I can add that this issues appeared in our other project.(unsafe is only in protobuf too)
So could you at least try to research this problem? I can add more stack traces if you need. Unfortunately, still have no idea how to reproduce it.
Can u try to integrate your practical codes as a UT and run go test -race -gcflags=-d=checkptr
to check if there is any panic?
@AsterDY It's SIGPROF again, as shown in the stack backtrace.
@OlegElizarov Do you have pprof of some kind of profiler attached to your program? It’s a known issue that Sonic is not very compatible with SIGPROF. We got a few reports, and still working on this.
Hello again. Thank for your help guys.
@AsterDY , we tried to add (-race -gcflags=-d=checkptr ) but there was no problems with sonic in this mode. Now we are trying to add gotraceback=crash to get coredump but there is a problem that after pc-encoded table panic our pod restarts and we don't know how to save dump now. But we are researching it.
@chenzhuoyu , yep , pprof is always on in our service.(some kind of rule in our corporation ) Is it possible to do something ? (some workaround)
Go pprof won't make sonic crash which we have tried... Maybe it's introduce by other kind of profiling
At present, since Go ASM doesn't support generating pcsp
pcdata for stack-unstable functions, the program will crash if the runtime do gentraceback()
while sonic's native func is running. We are trying to resolve this problem but still need a lot of work to do.
The root cause may be the output of asm2asm.py
contains many pushq/popq
instructions but these instructions are encoded via .byte/.word/.long
directives. The Go assembler uses a trivial algorithm to compute PCSP. In my understanding, it does a linear scan of instructions which implicitly requires that all push/pop pairs must be in the same basic block.
- https://github.com/golang/go/blob/a4d5fbc3a48b63f19fcd2a4d040a85c75a2709b5/src/cmd/internal/obj/x86/obj6.go#L843
pushq // +8
if <>
pop // -8
else
pop // -8
Since it may be difficult to ask Golang to improve its assembler, a possible solution is to use attribute naked
for C functions (perhaps by modifying emitted LLVM IR) and insert C prologue/epilogue manually without using push
and pop
in asm2asm.py
.
- https://llvm.org/docs/LangRef.html#id1695
Note that this still cannot prevent LLVM from generating instructions that adjust stacks.
- https://github.com/bytedance/sonic/blob/f87d87de7ac6a5d01e3e39bfbdf0999473bfd6d9/internal/native/avx2/native_amd64.s#L7955
Another potential issue is the stack alignment. In amd64
linux, the stack alignment is 16 byte so that movaps
can operate on stack memory operands without stack realignment.
- https://leria-info.univ-angers.fr/~jeanmichel.richer/assembly/doc/movaps.pdf
- https://gitlab.com/x86-psABIs/x86-64-ABI
- https://github.com/llvm/llvm-project/blob/e5906f64a63d064e6a9ea2c46ebc9c285ca02bd1/llvm/lib/Target/X86/X86Subtarget.cpp#L288
The weird thing is that Go aligns stack to 8 bytes on amd64.
- https://github.com/golang/go/blob/master/src/cmd/compile/abi-internal.md#stack-layout
Go realigns the stack when making call to c via cgo.
- https://github.com/golang/go/blob/a4d5fbc3a48b63f19fcd2a4d040a85c75a2709b5/src/runtime/asm_amd64.s#L797
Fortunately, there are only a few potential dangerous instructions in native_amd64.s
, which are also guarded by stack realignment.
- https://github.com/bytedance/sonic/blob/f87d87de7ac6a5d01e3e39bfbdf0999473bfd6d9/internal/native/avx2/native_amd64.s#L8264
- https://github.com/bytedance/sonic/blob/f87d87de7ac6a5d01e3e39bfbdf0999473bfd6d9/internal/native/avx2/native_amd64.s#L7955
fixed in #644