wasmer-go icon indicating copy to clipboard operation
wasmer-go copied to clipboard

Example early exit will cause GC bug

Open dawn-to-dusk opened this issue 2 years ago • 4 comments

Describe the bug

When wasm program exits early, if the program does not finish immediately, a panic will occur when doing garbage collection.

Steps to reproduce

1.Download github.com/wasmerio/wasmer, cd wasmer dir, git checkout master, switch to master release.
2.Take example_early_exit_test as an example. Add the following active GC statements before the test ends:

func ExampleFunction_Call() {
	//Other code is not modified, omitted
	......

	fmt.Println("Exited early with:", err)

	// The following statements are add statement
	runtime.GC()
	time.Sleep(10*time.Second)
	// Output:
	// Compiling module...
	// Instantiating module...
	// Calling `run` function...
	// Exited early with: exit code: 1
}
  1. Run the test directly in Golang

Expected behavior

The test can execute successfully.

Actual behavior

Got panic as following:

=== RUN   ExampleFunction_Call
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x43512c2]

runtime stack:
runtime.throw(0x416c9f7, 0x2a)
	/usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:726 +0x48c

goroutine 18 [syscall]:
runtime.cgocall(0x4122e50, 0xc0000306e0, 0x300)
	/usr/local/go/src/runtime/cgocall.go:133 +0x5b fp=0xc0000306b0 sp=0xc000030678 pc=0x4005f7b
github.com/wasmerio/wasmer-go/wasmer._Cfunc_wasm_trap_delete(0x4e04ef0)
	_cgo_gotypes.go:2028 +0x45 fp=0xc0000306e0 sp=0xc0000306b0 pc=0x4111225
github.com/wasmerio/wasmer-go/wasmer.newTrap.func1.1(0x4e04ef0)
	/Users/fangbaozhu/work/wasmer-go/wasmer/trap.go:38 +0x53 fp=0xc000030710 sp=0xc0000306e0 pc=0x411a873
github.com/wasmerio/wasmer-go/wasmer.newTrap.func1(0xc0001342c0)
	/Users/fangbaozhu/work/wasmer-go/wasmer/trap.go:38 +0x3d fp=0xc000030728 sp=0xc000030710 pc=0x411a8dd
runtime.call32(0x0, 0x416f888, 0xc00008e020, 0x1000000010)
	/usr/local/go/src/runtime/asm_amd64.s:540 +0x3e fp=0xc000030758 sp=0xc000030728 pc=0x406b39e
runtime.runfinq()
	/usr/local/go/src/runtime/mfinal.go:222 +0x1f4 fp=0xc0000307e0 sp=0xc000030758 pc=0x401aa14
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc0000307e8 sp=0xc0000307e0 pc=0x406cda1
created by runtime.createfing
	/usr/local/go/src/runtime/mfinal.go:156 +0x65

goroutine 1 [sleep]:
time.Sleep(0x2540be400)
	/usr/local/go/src/runtime/time.go:188 +0xbf
github.com/wasmerio/wasmer-go/examples.ExampleFunction_Call()
	/Users/fangbaozhu/work/wasmer-go/examples/example_early_exit_test.go:107 +0x5e5
testing.runExample(0x4167406, 0x14, 0x416f7a0, 0x416e5c9, 0x66, 0x0, 0x0)
	/usr/local/go/src/testing/run_example.go:62 +0x209
testing.runExamples(0xc000155ed0, 0x4257660, 0x7, 0x7, 0x0)
	/usr/local/go/src/testing/example.go:44 +0x1af
testing.(*M).Run(0xc000148000, 0x0)
	/usr/local/go/src/testing/testing.go:1346 +0x273
main.main()
	_testmain.go:55 +0x138

goroutine 20 [IO wait]:
internal/poll.runtime_pollWait(0xa355ee8, 0x72, 0x4193800)
	/usr/local/go/src/runtime/netpoll.go:222 +0x55
internal/poll.(*pollDesc).wait(0xc000124258, 0x72, 0x4193801, 0x4240890, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000124240, 0xc000066000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:159 +0x1a5
os.(*File).read(...)
	/usr/local/go/src/os/file_posix.go:31
os.(*File).Read(0xc000122028, 0xc000066000, 0x8000, 0x8000, 0x20, 0x0, 0x0)
	/usr/local/go/src/os/file.go:116 +0x71
io.copyBuffer(0x4193760, 0xc00000e060, 0x4193680, 0xc000122028, 0xc000066000, 0x8000, 0x8000, 0xc00010e2a0, 0xc00002c7a8, 0x4008acb)
	/usr/local/go/src/io/io.go:409 +0x12c
io.Copy(...)
	/usr/local/go/src/io/io.go:368
testing.runExample.func1(0xc000122028, 0xc00010e360)
	/usr/local/go/src/testing/run_example.go:37 +0x85
created by testing.runExample
	/usr/local/go/src/testing/run_example.go:35 +0x176

Process finished with exit code 1

Additional context

I feel the problem is that there are two new traps when wasm is reporting error, one is NewTrap called when function is executing, and the other is newTrap calles in newErrorFromTrap, so there are two setFinalizers for gc. There will be two WASM_trap_DELETE, resulting in the above panic. In order to solve this problem, I removed SetFinalizer from the first time, so panic will not appear, but I don't know if this solution is feasible. #290

dawn-to-dusk avatar Aug 18 '21 01:08 dawn-to-dusk

I'm experiencing crashes with SIG 16 that happen some time after an early exit. I assume it also related to GC / memory checks.

edaniels avatar Sep 10 '21 18:09 edaniels

I experience it as well.

gzigzigzeo avatar Apr 20 '22 17:04 gzigzigzeo

Tried to investigate it a bit. I have the following debug log:

1. about to call the function
2. function_trampoline is about to call NewTrap (function.go:120)
3. NewTrap called, C.to_wasm_new_trap returned pointer 0x131a08ce0
4. newTrap called with trap &{0x131a08ce0 <nil>} pointer 0x131a08ce0
5. function_trampoline called NewTrap and got 0x131a08ce0
6. newErrorFromTrap is about to call with 0x131a089b0 (function.go:323)
7. newErrorFromTrap is called with 0x131a089b0
8. newTrap called with trap &{0x131a089b0 <nil>} pointer 0x131a089b0
9. SetFinalizer called with &{0x131a089b0 <nil>} pointer 0x131a089b0
10. SetFinalizer called with &{0x131a08ce0 <nil>} pointer 0x131a08ce0
fatal error: unexpected signal during runtime execution--- PASS: ExampleFunction_Call (0.01s)

[signal SIGSEGV: segmentation violation code=0x2 addr=0x3000000000000000 pc=0x1031e7db0]

Please notice that after the actual function call (log line 6), struct pointer is changed.

function.go:119 looks like:

	if err != nil {
		trap := NewTrap(hostFunction.store, err.Error())

		runtime.KeepAlive(trap)

		return trap.inner()
	}

function.go:317:

		trap := C.wasm_func_call(self.inner(), &arguments, &results)

		runtime.KeepAlive(arguments)
		runtime.KeepAlive(results)

		if trap != nil {
			return nil, newErrorFromTrap(trap)
		}

newErrorFromTrap:

func newErrorFromTrap(pointer *C.wasm_trap_t) *TrapError {
	trap := newTrap(pointer, nil)

	return &TrapError{
		message: trap.Message(),
		origin:  trap.Origin(),
		trace:   trap.Trace().frames,
	}
}

newTrap sets the finaliser, which calls wasm_trap_delete. Finally, we've got two finalisers for two different pointers, the only difference is that to_wasm_trap_new is never called for the second. I do not yet know why the pointer gets changed. However, if I replace newTrap in the latest code sample with trap := &Trap{_inner: pointer}, it still fails even the pointers are same:

C.to_wasm_trap_new pointer is 0x138406260
C.wasm_trap_deleter pointer is 0x138406260

If I use pointers returned by wasm_func_call, everything works as expected.

@Hywan Could you please give any advice here?

gzigzigzeo avatar Apr 20 '22 20:04 gzigzigzeo

Looks like I've found the problem. In function.rs, wasm_func_call:

        Err(e) => Some(Box::new(e.into()))

Where e.into() (returned value of type RuntimeError) calls RuntimeError.from_trap which produces the new value of type Trap with the new pointer, which, in turn, goes into the box. Original value returned from the function is erased. Hence, it does not need to be GC-d. I'll try to make the PR.

Debug logs:

    Finished release [optimized] target(s) in 19.07s
=== RUN   ExampleFunction_Call
We are going to return trap 0x1400000e1c8 (pointer: 0x14c806770)
Trap we got from the function 0x14c806770 !
from_trap is called
Trap we are sending to go side: 0x14c806440
Trap we received (go side): 0x14c806440
--- PASS: ExampleFunction_Call (0.00s)

gzigzigzeo avatar Apr 21 '22 00:04 gzigzigzeo

邮件已收到,谢谢啦~

dawn-to-dusk avatar Sep 21 '22 10:09 dawn-to-dusk