Call Py_FinalizeEx() when process exits
Fixes #186
Currently, if you use PyCall from a "side thread" (any thread other than the main thread), the process does not exit when you exit. See #186 for more information.
Using this PR, the "side thread" may manually call PyCall.finalize before it exits. Then the main thread will exit properly. Unfortunately, it is not possible to call PyCall.finalize automatically in a side thread, because at_exit only runs on the main thread, and there is no handler such as thread.on_exit.
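Since there is no per-thread exit hook, a side thread has to arrange its own cleanup. A minimal sketch of one way to do that, assuming a hypothetical with_finalizer helper (not part of PyCall) and a lambda standing in for PyCall.finalize so the snippet runs without pycall installed:

```ruby
# Hypothetical helper, not part of PyCall: runs the block, then always runs
# the finalizer on the same thread, even if the block raises.
def with_finalizer(finalizer)
  yield
ensure
  finalizer.call
end

cleanup_threads = []
# Stand-in for PyCall.finalize; it only records which thread ran the cleanup.
fake_finalize = -> { cleanup_threads << Thread.current }

side_thread = Thread.new do
  with_finalizer(fake_finalize) do
    :python_work_would_happen_here
  end
end
side_thread.join
```

In real use the lambda body would be the PyCall.finalize call from this PR. The ensure clause is what guarantees the cleanup runs on the side thread itself rather than relying on at_exit.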
This PR:
- adds PyCall.finalize(), which calls Py_FinalizeEx()
- automatically calls PyCall.finalize() at_exit, if initialized on the main thread
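The registration rule in the second bullet can be modeled in plain Ruby. This is only a sketch: RuntimeSketch and its methods are hypothetical names, and the actual change in this PR may be structured differently.

```ruby
# Hypothetical model of the rule "install the at_exit hook only when
# initialized on the main thread". Not PyCall's real implementation.
class RuntimeSketch
  attr_reader :hook_installed, :finalized

  def initialize
    @started = false
    @finalized = false
    @hook_installed = false
  end

  # Stand-in for PyCall initialization.
  def start
    return if @started
    @started = true
    # Per the description above, at_exit is only useful on the main thread,
    # so the hook is skipped when a side thread does the initialization.
    if Thread.current == Thread.main
      at_exit { finalize }
      @hook_installed = true
    end
  end

  # Stand-in for PyCall.finalize; returns true only the first time it
  # actually finalizes, so calling it twice is harmless.
  def finalize
    return false if !@started || @finalized
    @finalized = true
  end
end
```

Under this model, a runtime started from a side thread gets no hook, which is why that thread must call finalize itself before it exits.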
A secondary advantage of this PR: it cleans up Python memory before exit, which might make it easier to use valgrind and other memory-debugging tools.
Example using this PR:
```ruby
#!/usr/bin/env ruby
side_thread = Thread.new do
  require 'pycall'
  PyCall.import_module('sys')
  PyCall.finalize # if this line is commented out, the process will hang on exit
end
side_thread.join
#=> process exits!
```
Before this PR (equivalent to commenting out PyCall.finalize), the process would never exit, even after the side thread had finished and the main thread had reached the end of the script.
This may still have issues with pandas 🐼: after PyCall.finalize has run, the process segfaults at exit, after the at_exit handlers have completed. I need to figure out how to debug this in lldb.
It might be necessary to unregister gc objects before calling Py_FinalizeEx. Ruby destroys these objects automatically at process exit, but by then Py_FinalizeEx has already invalidated their (PyObject *) pointers, which would explain the segfault at exit.
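The suspected ordering problem can be illustrated with a toy model. Everything here is hypothetical: InterpreterSketch is not a PyCall class, integers stand in for (PyObject *) addresses, and the model only shows why unregistering first would leave nothing dangling.

```ruby
# Toy model of the suspected failure mode; not PyCall code.
class InterpreterSketch
  def initialize
    @live = {}        # address => true while the pointer is still valid
    @guard_table = {} # address => Ruby-side wrapper awaiting destruction
  end

  def register(addr, wrapper)
    @live[addr] = true
    @guard_table[addr] = wrapper
  end

  # Stand-in for Py_FinalizeEx(): every pointer becomes invalid.
  # With unregister_first, the Ruby-side entries are dropped beforehand.
  def finalize(unregister_first:)
    @guard_table.clear if unregister_first
    @live.clear
  end

  # Stand-in for Ruby destroying the remaining wrappers at process exit.
  # Returns the addresses that would be touched after being invalidated.
  def dangling_at_exit
    @guard_table.keys.reject { |addr| @live[addr] }
  end
end

unsafe = InterpreterSketch.new
unsafe.register(0x1000, :dataframe)
unsafe.finalize(unregister_first: false)

safe = InterpreterSketch.new
safe.register(0x1000, :dataframe)
safe.finalize(unregister_first: true)
```

In the first case one address is still registered after its pointer was invalidated; in the second case nothing is left to destroy. Whether this is the actual cause of the pandas segfault is still unverified.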
Investigating Destructors
When a Ruby-refs-Python object is destroyed by Ruby:
PyCall.gcguard_table (class is gcguard_data_type in C)
Initialized when pycall.so starts:
- pycall_init_gcguard()
  - PyCall.gcguard_table = gcguard_new()
    - gcguard_new()
      - TypedData_Make_Struct(0, struct gcguard, &gcguard_data_type, gg)
      - gg->guarded_objects = st_init_numtable() # a "Hash"
When PyCall.gcguard_table is destroyed by Ruby:
- gcguard_data_type.function.dfree()
  - gcguard_free(gcguard *gg)
    - st_free_table(gg->guarded_objects)
    - PyCall.gcguard_table = nil
When a Python-refs-Ruby object is destroyed by Python:
- PyRuby_Type.tp_dealloc()
  - PyRuby_dealloc_with_gvl()
    - PyRuby_dealloc()
      - pycall_gcguard_unregister_pyrubyobj()
        - pycall_gcguard_delete(PyObject *pyobj)
          - gcguard = rb_ivar_get(mPyCall, id_gcguard_table)
          - gcguard_delete(gcguard, pyobj)
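The role the guard table plays in this dealloc path can be sketched in Ruby. This is a hypothetical model: the real table is a C st_table, and the sketch assumes it maps a Python object address to the Ruby object it keeps alive.

```ruby
# Hypothetical Ruby model of the guard table; not PyCall's implementation.
class GcGuardSketch
  def initialize
    @guarded_objects = {} # pyobj address => guarded Ruby object
  end

  # Hold a strong reference so Ruby's GC cannot collect the object while
  # the Python side still points at it.
  def register(pyobj_addr, ruby_obj)
    @guarded_objects[pyobj_addr] = ruby_obj
  end

  # Plays the role of gcguard_delete(): drop the entry for one pyobj.
  def delete(pyobj_addr)
    @guarded_objects.delete(pyobj_addr)
  end

  def guarded?(pyobj_addr)
    @guarded_objects.key?(pyobj_addr)
  end
end

# Plays the role of PyRuby_dealloc(): Python is done with the wrapper, so
# the guard entry is removed and the Ruby object becomes collectable again.
def dealloc_sketch(guard, pyobj_addr)
  guard.delete(pyobj_addr)
end
```

A short usage example: after `guard.register(0xBEEF, Object.new)`, `guard.guarded?(0xBEEF)` is true until `dealloc_sketch(guard, 0xBEEF)` runs.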
pycall_gcguard_register() (does not appear to be used?)
pycall_gcguard_register() registers weak-refs to Python objects, which call pycall_gcguard_delete() when they are destroyed. It does not appear to be used, but I have documented it anyway to be careful.
- pycall_gcguard_register(PyObject *pyobj)
  - wref = Py_API(PyWeakref_NewRef)(pyobj, weakref_callback_pyobj);
  - pycall_gcguard_aset(wref, obj)
- Later, when pyobj is garbage collected by Python, it will run:
  - weakref_callback_pyobj
    - gcguard_weakref_destroyed()
    - pycall_gcguard_delete(PyObject *weakref)
    - gcguard = rb_ivar_get(mPyCall, id_gcguard_table)
    - gcguard_delete(gcguard, pyobj)
Initializing pycall.so registers the weakref_callback_pyobj() callback:
- pycall_init_gcguard()
  - weakref_callback_pyobj = Py_API(PyCFunction_NewEx)(&gcguard_weakref_callback_def, NULL, NULL);