PythonCall.jl

Do not load libpython when precompiling

Open cjdoris opened this issue 2 months ago • 2 comments

I propose that when PythonCall is being precompiled, we do not load/initialise libpython. Instead, have shims for the C functions that we use which raise an error or do something else trivial when used.

This means that we do not need to resolve a CondaPkg environment when precompiling, which will make precompilation much faster, simpler and more reliable.

This is breaking because it restricts which Python operations you are allowed to perform during __init__. I suggest that we make all of our shims return errors (clearly stating that you cannot perform this Python operation during module initialisation) except for those which return Python objects, which should return PyPtr(1) (or anything non-NULL) instead.

In particular, we should allow these operations at init:

  • incref, decref, initialize, finalize and other such functions we only use internally
  • pyimport, pygetattr, pygetitem, pyrepr, pystr, pycall, pyhash (always return 1)
  • creation of python objects from nothing, bool, string, number, etc. (Py)
  • GIL handling

And we'll need to disallow the following functions, which require an actual Python interpreter to return reasonable answers:

  • pylen, pyhasattr, pynext (PyIter_Next)
  • pyconvert (such as PyLong_AsLongLong etc)

Concretely, we can make shim functions like this:

shim_error() =
    error("You cannot perform this Python operation during module initialisation.")

shim_PyObject_IsTrue(::PyPtr) = shim_error()
shim_PyObject_Length(::PyPtr) = shim_error()
...

shim_Py_IncRef(::PyPtr) = nothing
shim_PyImport_ImportModule(::Ptr{Cchar}) = PyPtr(1)
shim_PyObject_GetAttr(::PyPtr, ::PyPtr) = PyPtr(1)
shim_PyLong_FromLongLong(::Clonglong) = PyPtr(1)
...

And then in init_pointers set the fields of POINTERS like this:

p.PyObject_IsTrue = @cfunction(shim_PyObject_IsTrue, Cint, (PyPtr,))
p.PyImport_ImportModule = @cfunction(shim_PyImport_ImportModule, PyPtr, (Ptr{Cchar},))
...
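
To make the mechanism concrete, here is a minimal, self-contained sketch of the idea, not PythonCall's actual implementation. It assumes `PyPtr` is a plain `Ptr{Cvoid}` and uses a hypothetical two-field `ShimPointers` struct in place of the real `POINTERS` table; only the shim names come from the snippets above.

```julia
# Assumption for this sketch: PyPtr is just an opaque pointer type.
const PyPtr = Ptr{Cvoid}

shim_error() =
    error("You cannot perform this Python operation during module initialisation.")

# Shim that needs a real interpreter to answer: always errors.
shim_PyObject_IsTrue(::PyPtr) = shim_error()

# Shims that are allowed: do nothing, or return a dummy non-NULL pointer.
shim_Py_IncRef(::PyPtr) = nothing
shim_PyLong_FromLongLong(::Clonglong) = PyPtr(1)

# Hypothetical stand-in for the real pointer table.
mutable struct ShimPointers
    Py_IncRef::Ptr{Cvoid}
    PyLong_FromLongLong::Ptr{Cvoid}
end

# Build the table at run time; @cfunction pointers must not be cached
# across sessions, so this belongs in an init function.
init_shim_pointers() = ShimPointers(
    @cfunction(shim_Py_IncRef, Cvoid, (PyPtr,)),
    @cfunction(shim_PyLong_FromLongLong, PyPtr, (Clonglong,)),
)

p = init_shim_pointers()

# Call through the function pointers exactly as one would with libpython.
fromlong = p.PyLong_FromLongLong
incref = p.Py_IncRef
obj = ccall(fromlong, PyPtr, (Clonglong,), 42)  # dummy object pointer
ccall(incref, Cvoid, (PyPtr,), obj)             # no-op
```

Because the call sites only ever see function pointers, they would not need to know whether they are talking to the shims or to a real libpython.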

We should also initialise all the exceptions and other object pointers to PyPtr(1).

cjdoris avatar Oct 23 '25 14:10 cjdoris

I do worry a little about how this will affect precompilation workloads. Anecdotally, I found that a somewhat intensive workload significantly cut down on the TTFX for the IJulia extension: https://github.com/JuliaLang/IJulia.jl/blob/13db510a2582d394c8ad25a78ae3d5d0450f5728/ext/IJuliaPythonCallExt.jl#L340

Would it be possible to make it a preference or something, just in case people want to opt-in to loading the full interpreter?
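
As a rough sketch of what such an opt-in could look like (purely illustrative): the environment variable name below is hypothetical, and a real implementation would more likely use a Preferences.jl preference. The `jl_generating_output` check is a commonly used idiom for detecting precompilation, not a documented public API.

```julia
# True while Julia is generating output such as a precompile cache.
is_precompiling() = ccall(:jl_generating_output, Cint, ()) == 1

# Hypothetical switch name, invented for this sketch.
const OPT_IN_VAR = "HYPOTHETICAL_PYTHONCALL_PRECOMPILE_WITH_PYTHON"

# Use the shims only when precompiling and the user has not opted in
# to loading the full interpreter.
function use_shims(; env = ENV)
    opted_in = get(env, OPT_IN_VAR, "no") == "yes"
    return is_precompiling() && !opted_in
end
```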

JamesWrigley avatar Nov 14 '25 10:11 JamesWrigley

Yeah sure.

This is only a speculative feature; it might not come to anything. In particular, AFAIU a package's __init__() does not get called when that package is precompiled; only the __init__ functions of the packages it imports get called.

This means that if you are developing a package, you will not discover that you made some illegal calls in its __init__ function until a downstream package gets precompiled. I think that makes this proposal a non-starter.

It would be nice if CondaPkg could resolve its environment at install time via some hook in Pkg, but that's not currently possible.

cjdoris avatar Nov 14 '25 10:11 cjdoris