luv icon indicating copy to clipboard operation
luv copied to clipboard

Make all C -> Lua calls protected and allow user error handlers in uv.run()

Open creationix opened this issue 9 years ago • 10 comments

This task will fix #99.

The idea is to protect all calls into Lua from C using pcall. If there is an error, it will be handled by the current global error handlers. If there is no current error handler then we'll throw an unprotected lua error (most likely this will crash the process).

Since calls to uv.run() are blocking, then "nested" calls will be all "stacked" in the current C stack. Therefore users can have nested error handlers. A unit test framework may wish to handle all global errors, but only for the duration of it's test unit. It can call and block on uv.run() with a custom error handlers that pertains to that one test. When the test finishes and uv.run() returns, then the old error handler will be restored.

The new signature will be uv.run(mode, onError).

creationix avatar Mar 26 '15 20:03 creationix

Perhaps make the errors a ring. Just to make sure it isn't unbounded.

nm... I was confused.

rphillips avatar Mar 26 '15 20:03 rphillips

That's not a problem, errors thrown by error handlers won't feed back into themselves.

creationix avatar Mar 26 '15 20:03 creationix

My implementation idea is to add a couple global variables per luv instance. One will hold a reference to the current error handler function. The other will be a temporary place to hold the error value when an error happens.

The global error handler on the C side will simply store the error in the global place and unblock uv_run() with uv_stop(). Then the luv_run() wrapper will check the global error slot for a possible error.

creationix avatar Mar 26 '15 21:03 creationix

WIP at #129

creationix avatar Mar 26 '15 22:03 creationix

I decided this is a very deep rabbit-hole with little benifit. I instead made a simple change to lua_pcall all callback event sources and hard-error if there is an uncaught exception. All existing luv, luvi, and luvit unit tests still pass with this change since we almost always wrap our code in xpcall to get stack traces.

This change will simply enforce the current de-facto behavior and let you know if you forget to protect your event sources.

creationix avatar Mar 30 '15 17:03 creationix

Implemented in https://github.com/luvit/luv/commit/7526c1d9955f085df9eb1c25fcc31c0cfe67e583

creationix avatar Mar 30 '15 17:03 creationix

Probably should be reopened if you allow my humble five cents...

Catching runtime exceptions isn't the only reason why a C->Lua call should be protected. Studying your code, I can see that you pay no attention to the fact that many Lua C API functions may trigger out of memory exceptions. For example, lua_pushcfunction() that you use in luv_call_callback is prone to exceptions in Lua 5.1 due to the absence of light C functions (it always create a closure). In other words, any function that leads to memory allocation may fail (new strings, tables, string conversions, etc). One must pay attention to the types of exceptions a C API function generates as denoted in the Lua C API manual to the right of their names. What if one wants to confine memory usage of a Lua state and provides a custom lua_Alloc with some sort of accounting when luv is used as a library? Or is it impossible to create a custom Lua state?

Speaking more broadly, a pattern to reliably call a callback must be:

  1. lua_cpcall() into a callback wrapper with an argument pointing to a structure on the C stack which you fill with the outer callback's arguments;
  2. Inside the wrapper, it is safe to manipulate the stack, so fetch your Lua function, prepare its arguments and call it via lua_pcall() or even lua_call() since you are already protected;

Another problem to this is that in your code, you mix malloc() and the allocator of the Lua state (indirectly). Moreover, many calls to malloc() don't even bother to check if the allocation succeeded or just assert it's successful. The assumption that if a memory block cannot be allocated, the process must terminate is a brave one.:-) I'd recommend allocating internal structures via lua_newuserdata() for protection and consistency or at least use lua_getallocf() to fetch the current Lua state's allocator.

Loosely connected to the above, I can see that luv_alloc_cb() always allocates suggested_size which is always 64Kb by default. This is very slow and ineffective on most systems. I'd recommend buffer pooling.

neoxic avatar Dec 13 '18 02:12 neoxic

The assumption that if a memory block cannot be allocated, the process must terminate is a brave one.

I think of it as a non-feature to keep things simpler and easier to maintain. Is this constraint actually a problem for anyone? Does it prevent you from doing something you want to do?

That said, you bring up good suggestions. I'm just not sure who has the bandwidth to implement the change.

creationix avatar Dec 22 '18 20:12 creationix

Well, I understand that this is fine grained approach and for most people it sounds like overkill, but in fact, it is the way to do things right as it is done in Lua itself. For example, check the source of standalone Lua to see that it first obtains protection via lua_cpcall (or lua_pcall for Lua 5.3) and only then manipulates the stack - loads chunks, pushes arguments, etc.

Based on years of direct experience with daemons running Lua code 24/7, we constantly run into situations when inadvertent Lua memory leaks happen (missing table clean ups, recursions, etc.). Lua itself is durable enough not to crash in such cases. Some measures are taken to ease things a bit internally like emergency garbage collection, gradual approach in allocation for LuaJIT, etc. But allocation related design flaws in the host code will definitely lead to a crash/termination of such a process which means it will be much harder to investigate what happened while it could have been possible to try to debug things live.

Another case is when a Lua state is sandboxed in terms of memory (may be not related to luv though if it's not possible by design to provide an already created Lua state to it). Imagine when a custom allocation function is provided to a Lua state upon creation and a memory exception happens (memory limit has been exceeded and the allocator deliberately returned NULL), the whole unprotected thing finds itself in the default panic function with a broken event loop because a long jump has just been made. By default, exit() is called after panic which means we lose the process. One can think of long jumping somewhere to avoid it, but that is useless anyways due to undefined state of the loop.

Hope this helps.

neoxic avatar Dec 23 '18 05:12 neoxic

Another unrelated thing (which probably requires a separate issue to be opened) is that one must be very careful when converting stack values to strings via any of lua_tostring, luaL_checkstring, luaL_optstring and other functions that may cause coercion. For example, I can see that in your code when a table of strings is written to a stream, there is no protection that an element might in fact be a number. When a uv_buf_t vector is being initialized, the item is converted to a string on the stack and its pointer is saved in the vector. All seems good. However, after the conversion is made and before the actual string is written to a socket in a uv_write_t request, garbage collection may happen effectively invalidating the saved pointer because its source string existed briefly on the stack as a result of coercion (the item in the table is still a number, not a string). And that means in turn that undefined data will actually be written.

Btw, the table that is used to form a uv_buf_t vector may be altered even after the call directly in Lua which will lead to the same outcome. There is no protection against it.

I haven't checked through your code much, but there are many other caveats when using Lua C API that one must be aware of. They are all more or less mentioned in the manual though.

neoxic avatar Dec 23 '18 05:12 neoxic