sol2: Sol needs a coroutine safety guide

Following from the discussion in https://github.com/ThePhD/sol2/issues/890#issuecomment-3555487828 and https://github.com/ThePhD/sol2/issues/1711#issuecomment-3556723670 ...

Sol's tutorials depict an approach to writing script bindings that can lead to heap corruption if those bindings are invoked from a coroutine, even if the coroutine is created and operated by scripts. This can come as a nasty surprise to library users who neglect to read the documentation subpages about threads because they aren't using sol's coroutine API.

The technical problem as I understand it (take this with a grain of salt): Heap corruption can occur whether the coroutine is created and operated by scripts or by the host app, so long as the host app either stores sol::reference-based values somewhere in its own data model or (worse yet) passes native data types containing any kind of sol::reference back to a script. Sol bindings written in the 'standard style' will tend to use the lua_State of the coroutine that invoked them, and create these references in a non-main registry. When the coroutine completes, that registry may be deleted or recycled, causing all of these references to go out of scope. This invalidates those sol::references; what happens next is undefined behavior, but it typically corrupts a heap or freelist, causing a crash with unpredictable timing deep in the runtime.

This is a major pitfall, responsible for a number of issues on the repo. I think it could be resolved by adding a prominent item to Sol's table of contents with a title like "how to write coroutine-safe code".

This could either lead to an updated version of the thread page or to a new page written as a guide. Ideally it should explain when and why types like main_reference and this_main_state are necessary without venturing into topics like CPU threading.

Nov 22 '25 02:11 EvanBalster

"Heap corruption" might be a little too far. What you create are dangling references (and using them is a use-after-free). In my opinion, the page should include:

Each coroutine has its own Lua state.
References (e.g. tables or functions) are stored in the "current" Lua state.
References prefixed with main_ are stored in the main Lua state.
If a reference is used across Lua invocations (i.e. create reference → Lua executes → use reference), you almost always want the main_ variant. This assumes you're running unknown code.
Examples of where you can accidentally create dangling references.
Examples of how to do it correctly.
A mention that creating usertypes should almost always use a main_state (because these might create references that should be visible everywhere).
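
As a rough illustration of the first four points, here is a library-free toy model in plain C++. It is not sol2 code and says nothing about how sol2 or Lua are actually implemented; ToyThread, ToyRef, make_ref, make_main_ref and demo are all made-up names. It only assumes what is described in this thread: a reference remembers the state it was created through, and a coroutine's state can go away while the main state lives on.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy stand-in for a lua_State: one per coroutine, plus one main thread.
// All names here are hypothetical; none of this is sol2 or Lua API.
struct ToyThread {
    std::shared_ptr<ToyThread> main;  // null when this thread *is* the main thread
};

// Toy stand-in for a reference: it remembers the thread it was created through.
// A weak_ptr lets the model detect a dead thread instead of invoking undefined behavior.
struct ToyRef {
    std::weak_ptr<ToyThread> owner;
    bool usable() const { return !owner.expired(); }
};

// Analogue of a plain reference: anchored to whichever thread is currently running.
ToyRef make_ref(const std::shared_ptr<ToyThread>& current) {
    return ToyRef{current};
}

// Analogue of a main_ reference: hops to the main thread before anchoring.
ToyRef make_main_ref(const std::shared_ptr<ToyThread>& current) {
    return ToyRef{current->main ? current->main : current};
}

// Creates both kinds of reference inside a coroutine, lets the coroutine be
// collected, and reports which references survived as {plain, main-anchored}.
std::pair<bool, bool> demo() {
    auto main_thread = std::make_shared<ToyThread>();
    auto coroutine = std::make_shared<ToyThread>();
    coroutine->main = main_thread;

    ToyRef plain = make_ref(coroutine);
    ToyRef anchored = make_main_ref(coroutine);
    assert(plain.usable() && anchored.usable());  // both fine while the coroutine lives

    coroutine.reset();  // the coroutine finishes and is collected

    return {plain.usable(), anchored.usable()};
}
```

In this model demo() returns {false, true}: the reference anchored to the coroutine is left pointing at a thread that no longer exists, while the one anchored to the main thread is still usable. In real code nothing checks for the dead thread, so touching the first kind of reference would be the use-after-free described above.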

Nov 22 '25 09:11 Nerixyz

I am a little confused now because I've come across some apocrypha saying that coroutines do use the same registry as the main lua_State, but this seems to conflict with my firsthand experience using sol with LuaJIT. Perhaps the coroutine's registry "inherits" from the main one? I would write a guide myself if I had a stronger understanding of the specifics.

The main offender in my program was a C++ function exposed to Lua that returned a class containing a collection of Lua references. That class was never retained by C++ code, instead going into the possession of Lua's garbage collector. If the coroutine ended, the references (created from the coroutine state) would cause issues when the garbage collector finalized the C++ object. This caused some kind of double-free heisenbug which would variously corrupt either the registry's freelist or LuaJIT's bulk allocator heap.

Here's a stripped-down version of that "coroutine-unsafe" code:

// An ever-changing list we expose to Lua from time to time.
static std::vector<Bogo*> bogos = {...};

// A userdata type that represents a snapshot of the list.  Pretend it has some useful methods.
struct LuaBogos
{
   std::vector<sol::userdata> bogos;
};

// A function Lua scripts can call to get the snapshot.
// Unsafe: sol::this_state is the state of whichever coroutine made the call,
// so every sol::userdata stored below is created from that coroutine's state.
LuaBogos get_bogos(sol::this_state lua)
{
   LuaBogos result;
   for (auto &bogo : bogos)
   {
      result.bogos.emplace_back(sol::make_userdata(lua, bogo));
   }
   return result;
}

Nov 24 '25 00:11 EvanBalster