golua icon indicating copy to clipboard operation
golua copied to clipboard

C stack overflow in error handler

Open jcelliott opened this issue 11 years ago • 11 comments

When calling a lua function that errors, the C stack is not cleaned up. If the lua function is called many times, it will eventually overflow the stack. Any further calls to the error handler result in the error "error in error handling" being returned.

This is the smallest test case I could come up with to reproduce it. I'm on Linux using lua 5.1.5 and the most recent version of golua.

error.go:

package main

import "github.com/aarzilli/golua/lua"
import "fmt"

func main() {
    L := lua.NewState()
    L.OpenLibs()
    if err := L.DoFile("error.lua"); err != nil {
        panic(err)
    }

    // 120 worked on my machine, might be different for you
    for i:=0; i<120; i++ {
        top := L.GetTop()
        L.GetGlobal("doError")
        err := L.Call(0, lua.LUA_MULTRET)
        if err != nil {
            fmt.Println("error:", err)
            fmt.Println("stack length:", len(L.StackTrace()))
            // fmt.Println("stack:", L.StackTrace())
        }
        L.SetTop(top)
    }
}

error.lua:

function doError()
    error("something bad")
end

jcelliott avatar Jan 31 '14 01:01 jcelliott

Is there any progress on this? This is kind of a big deal for long-running servers that use lua scripting for plugins (that are always loaded).

If you don't have time to fix this now, could you explain where to look to fix it?

beatgammit avatar Feb 19 '14 01:02 beatgammit

It's pretty complicated, the root of the problem is that when luavm hits an error it will longjump all the way back to the last pcall. We can't let this happen, in general because if between the last pcall and the call to lua_error there is a go function the go runtime will eventually find out that the C stack changed under its feets and choke. What I did was install a lua error handler that bypassed lua's error handling entirely using a call to panic, this is what causes your bug: the error handler never returns and the C stack is never rolled back. It wouldn't be so much of a problem if lua_error were only called through lua code, unfortunately there are several lua API calls that can end up calling lua_error if the call was not set up correctly.

aarzilli avatar Feb 19 '14 17:02 aarzilli

I'm working around this now by closing and re-initializing scripts when the "C stack overflow" error is reported. I don't see any memory leakage on initial tests, so this may be acceptable.

Perhaps this should be noted in the README?

beatgammit avatar Feb 22 '14 08:02 beatgammit

I replace lua error with my own error handler, and find no “c stack overflow”, maybe.

func doError(l *lua.State) int {
    message := l.ToString(-1)

    panic(fmt.Errorf(message))
}

L.Register("error", doError)

siddontang avatar Sep 03 '14 00:09 siddontang

https://github.com/navy1125/golua I replace panic to error string , it can resolve this problem ugly...

I'm pleasure to put a pull request if nessary

navy1125 avatar Sep 07 '15 18:09 navy1125

Could you split the go fmt from the changes? It's hard to see what you patched like this. Thank you.

aarzilli avatar Sep 08 '15 07:09 aarzilli

sorry ,last two commits

navy1125 avatar Sep 09 '15 10:09 navy1125

There are two problems with this approach:

  1. changing the behavior of lua.(*State).RaiseError like you did breaks API backwards compatibility (this could be overlooked or worked around if problem n. 2 didn't exist)
  2. this approach only works for errors raised from go, if lua raises an error and we let lua's error handling continue it will do a longjmp which will make cgo choke. This can be seen by running TestCheckStringFail in lua/lua_test.go.

aarzilli avatar Sep 09 '15 12:09 aarzilli

I have a fix for this issue: https://github.com/aarzilli/golua/pull/70

andrewhare avatar Jun 06 '19 17:06 andrewhare

@aarzilli the correct (but somewhat annoying) solution is to wrap FFI calls in both directions, with wrappers written in the foreign language that handle converting the exception-propagation mechanism appropriately.

Specifically:

  • Lua->Go calls: for any called Go function that might panic, wrap it in a Go function that recovers the panic, convert it to an error value, return it to the Lua side, and have the Lua code convert it to a Lua error*
  • Go->Lua calls: in any place where a Lua C API call may trigger a longjmp, you have to wrap that call with a lua_pcall (thus effectively capturing the error and converting it to a value before returning to the Go caller), and have the Go code convert any returned error value to a Go error

I believe this should ensure that the stack is always unwound appropriately, and never incorrectly done across FFI boundaries. This approach is taken by some other existing Lua embedding/binding projects.


*: Or, possibly store the information about the panic in a variant error, and modify Lua's pcall/xpcall so they can't catch these special panic-wrapped-in-errors, and have your Go->Lua wrapper convert them back to panics at the original Go caller

Shados avatar Sep 06 '19 12:09 Shados

As far as I can see, the only problem with error handling is the case when an error is raised from the Go side. In that case the Go stack is left unwound because of longjmp which results in a corrupted syscall frame. But the errors raised from the Lua side seem to be safe even if there is a completed Go function call between Lua pcall and error.

So what if we remove the panic msghandler, move the error call to the Lua side and make sure that no C API calls that can raise an error are ever made in the lib?

lua.(*State).PCall then would become something like this:

func (L *State) PCall(nargs, nresults int) error {
	r := L.pcall(nargs, nresults, 0)
	if r != 0 {
		msg := L.ToString(-1)
		st := L.StackTrace()
		L.Pop(1)
		return &LuaError{r, msg, st}
	}
	return nil
}

The "move error to the Lua side" part could be implemented by using debug hooks called on function return and a global variable containing an error:

debug = require("debug")
debug.sethook(function()
	if _LUAGO_ERROR ~= nil then
		local err = _LUAGO_ERROR
		_LUAGO_ERROR = nil
		error(err)
	end
end, "r")

On the client we will need to set the value of the global variable and successfully complete the call:

L.Register("foo", func(L *State) int {
	if !L.IsNumber(-1) {
		v := L.ToString(-1) // won't handle anything other than strings, but it is just an example
		L.PushString(fmt.Sprintf("oh no, %q is not a number", v))
		L.SetGlobal("_LUAGO_ERROR")
	}
	return 0
})

L.GetGlobal("foo")
L.PushString("oops")
if err := L.PCall(1, 0); err != nil {
	fmt.Println("error:", err)
}

Please, correct me if I am missing something.

rnovatorov avatar Aug 27 '21 20:08 rnovatorov