jep
Freeze when using SubInterpreter's shared modules and doing from scipy import ndimage with scipy 1.9+
Describe the bug
We have encountered a freeze of a thread that is running with Jep's SubInterpreter and using shared modules for numpy and scipy. The Python code
    import scipy
    from scipy import ndimage
freezes on the import of ndimage. It works fine with scipy 1.8, but with scipy 1.9 and 1.10 it freezes. If you import directly with import scipy.ndimage it works fine; only the from ... import form has the issue. Interestingly, in scipy 1.9 they reworked the import of submodules here: https://github.com/scipy/scipy/pull/15230. So that is probably related somehow.
To Reproduce
public static void main(String[] args) throws JepException {
    JepConfig jepConfig = new JepConfig();
    jepConfig.addSharedModules("numpy", "scipy");
    try (Jep jep = new SubInterpreter(jepConfig)) {
        jep.exec("import numpy");
        jep.exec("import scipy");
        System.out.println("Start ndimage import");
        jep.exec("from scipy import ndimage");
        System.out.println("End ndimage import");
    }
}
Expected behavior
The import does not freeze and scipy works ok.
Environment (please complete the following information):
- OS Platform, Distribution, and Version: tested against RHEL 7 and RHEL 8
- Python Distribution and Version: tested against Python 3.8 and 3.11
- Java Distribution and Version: OpenJDK 11
- Jep Version: tested against Jep 3.9.0 and 4.1.1
- Python packages used (e.g. numpy, pandas, tensorflow): numpy, scipy
Here is what we want to happen when a submodule like scipy.ndimage is imported in a sub-interpreter:
1. Python breaks apart scipy.ndimage and imports scipy first.
2. The import hits the jep shared_modules_hook, which sees that scipy is shared.
3. The shared_modules_hook sends the import to the main interpreter thread.
4. The main interpreter imports scipy and returns.
5. The shared_modules_hook returns the scipy module from the main interpreter.
6. Python will then attempt to import scipy.ndimage.
7. The import hits the jep shared_modules_hook, which sees that scipy is shared.
8. The shared_modules_hook sends the import to the main interpreter thread.
9. The main interpreter imports scipy.ndimage and returns.
10. The shared_modules_hook returns the scipy.ndimage module from the main interpreter.
11. Python returns scipy.ndimage from the shared_modules_hook.
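The parent-first order in the steps above can be observed with a small meta path finder. This is only a sketch: LoggingSharedFinder and the throwaway shared_demo package are hypothetical names made up for illustration, and the finder merely records the imports a shared-modules hook would see instead of forwarding them anywhere. It is not Jep's actual shared_modules_hook.

```python
import importlib
import importlib.abc
import sys
import tempfile
from pathlib import Path


class LoggingSharedFinder(importlib.abc.MetaPathFinder):
    """Hypothetical stand-in for a shared-modules hook (not Jep's code)."""

    def __init__(self, shared_names):
        self.shared_names = set(shared_names)
        self.forwarded = []

    def find_spec(self, fullname, path, target=None):
        # A name is treated as shared if its top-level package is shared.
        top_level = fullname.partition(".")[0]
        if top_level in self.shared_names:
            # A real hook would hand the import to the main interpreter here.
            self.forwarded.append(fullname)
        return None  # let the regular finders perform the import


# Build a throwaway package so the sketch does not depend on scipy.
root = Path(tempfile.mkdtemp())
(root / "shared_demo").mkdir()
(root / "shared_demo" / "__init__.py").write_text("")
(root / "shared_demo" / "sub.py").write_text("VALUE = 1\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

finder = LoggingSharedFinder(["shared_demo"])
sys.meta_path.insert(0, finder)
try:
    import shared_demo.sub
finally:
    sys.meta_path.remove(finder)

print(finder.forwarded)  # ['shared_demo', 'shared_demo.sub']
```

The finder is consulted once for the parent package and once for the submodule, which corresponds to steps 2 and 7.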
This process actually works fine if you do import scipy.ndimage on a sub-interpreter. When you do from scipy import ndimage things start to fall apart at step 6. In this case Python checks the scipy module for an existing attribute named ndimage. This ends up in the new scipy __getattr__ function, which uses importlib to import scipy.ndimage.
The problem is that importlib was imported when the scipy module was created, and since the module was created on the main interpreter, it is the importlib module from the main interpreter. The main interpreter doesn't have the shared_modules_hook, so step #7 just doesn't happen. Instead the import proceeds normally on the sub-interpreter, using the importlib from the main interpreter. This doesn't actually run into problems until the scipy.ndimage module tries to import another module. At that point the import of scipy.ndimage._filters goes through the importlib for the sub-interpreter and finds the shared_modules_hook, which transfers the import to the main interpreter, which freezes. I believe the freeze is because there are locks in importlib that prevent multiple threads from doing the same import. Since the sub-interpreter is using the importlib from the main interpreter, it holds the locks, so when an import is transferred to the actual main interpreter thread it cannot get the locks and freezes. I am not sure which locks are actually causing the problem.
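The difference between the two import forms can be reproduced without scipy or Jep. The sketch below assumes a package whose __init__.py loads submodules lazily through a module-level __getattr__, using an importlib reference bound when the package is first executed. The package names lazy_a and lazy_b and the lazy_calls list are hypothetical and exist only for this demonstration; it shows which code path runs, not the freeze itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source for a package __init__.py with a lazy module-level __getattr__.
INIT_SOURCE = '''
import importlib as _importlib   # bound once, when the package is executed

lazy_calls = []                  # demonstration only: records lazy lookups

def __getattr__(name):
    if name == "sub":
        lazy_calls.append(name)
        return _importlib.import_module(f"{__name__}.{name}")
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''


def make_package(root, name):
    """Write a package with the lazy __init__.py and one submodule."""
    pkg = Path(root) / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text(INIT_SOURCE)
    (pkg / "sub.py").write_text("VALUE = 42\n")


root = tempfile.mkdtemp()
make_package(root, "lazy_a")
make_package(root, "lazy_b")
sys.path.insert(0, root)
importlib.invalidate_caches()

# Form 1: the import system loads the submodule itself.
import lazy_a.sub
plain_calls = list(lazy_a.lazy_calls)

# Form 2: Python first looks for an attribute named "sub" on the package,
# which lands in the package's __getattr__ and its captured importlib.
from lazy_b import sub
from_calls = list(sys.modules["lazy_b"].lazy_calls)

print(plain_calls)  # []
print(from_calls)   # ['sub']
```

Only the from form reaches __getattr__, so only that form depends on which importlib the package captured when it was created.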
As far as fixing the problem, I am not sure we can actually fix it in Jep. My strongest recommendation is to switch to SharedInterpreter instead of SubInterpreter if you are using python modules which are incompatible with sub-interpreters. Shared Modules can provide a nice workaround in some cases, but I do not think we can smooth out all the odd behavior in cases like this. If you control the python code executing in jep you could also just import scipy.ndimage instead of from scipy import ndimage.
It might be possible to install an import hook on the main interpreter that would detect if it is running in a different interpreter, but I am not sure we can do much after detecting a potential problem. If we could do a normal shared import from a hook on the main interpreter it would potentially fix the problem, but I suspect trying to do that would lead to freezes, like it does now, because the sub-interpreter would already hold the import locks for the main interpreter. It may be worth testing since I don't fully understand the locking mechanisms in importlib. If we could check the main interpreter import lock from the sub-interpreter we could definitely prevent freezing, but all we could really do is throw an exception. Again, more research into the locking would be needed to see if that is even possible, and I do not think it will lead to a full solution, just an error instead of a freeze.
If you don't mind modifying scipy code, another workaround is to move the import of importlib into the __getattr__ function. That way it would use the importlib from the sub-interpreter rather than the importlib from the main interpreter.
Thank you for the analysis. On the system in question we could change this particular import to not be a from scipy import submodule import, but ultimately that isn't a safe solution, as the system can be extended by others with Python code that we don't have control over, and therefore it could easily be reintroduced.