Parla.py

Some import mechanisms bypass a customized builtins.__import__

Open insertinterestingnamehere opened this issue 4 years ago • 7 comments

Although we can currently load CUDA into a VEC, we can't yet load cupy. The error shows up as an assertion failure in our custom import code. I've done some work on diagnosing this, and here's what I think is going on.

Some Python C API routines bypass a customized builtins.__import__. In particular, PyImport_ImportModuleLevelObject does, and that's one of the C API routines Cython uses to implement cimports. This only hits us with cupy because our import override currently only detects imports that are the first to load a new submodule of a given base-level module. Our existing examples work fine because, in the wild, it's rare for a cimport to be that kind of first import while also being implemented as an API call that bypasses our modified __import__. IIRC, in this case the bad import happens when cupy.cuda.device triggers the first load of cupy_backends.cuda.libs.cublas. It's not the first load of anything from cupy_backends, but the earlier imports from that package have already completed, so no import of anything in cupy_backends is still in progress to catch the changes to sys.modules that result from the lazy import of cupy_backends.cuda.libs.cublas. See https://github.com/cupy/cupy/blob/890e40cfd29c2ea37d52fbbef3d2e7d7ceb105d7/cupy/cuda/device.pyx#L8 for the culprit.
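To make the failure mode concrete, here is a minimal sketch of window-based detection: a context manager that records which sys.modules entries under a set of base names appear while it is open. This is not Parla's actual import code. The helper name watch_new_modules and the throwaway packages demo_front and demo_back (standing in for cupy and cupy_backends) are made up for illustration.

```python
import contextlib
import importlib
import pathlib
import sys
import tempfile


@contextlib.contextmanager
def watch_new_modules(*base_names):
    """Collect sys.modules keys under base_names that appear during the block.

    Hypothetical helper for illustration only.
    """
    def belongs(key):
        return any(key == b or key.startswith(b + ".") for b in base_names)

    before = {k for k in sys.modules if belongs(k)}
    added = set()
    try:
        yield added
    finally:
        added.update(k for k in list(sys.modules)
                     if belongs(k) and k not in before)


# Build two throwaway packages on disk. Importing demo_front.device or
# demo_front.stream pulls in a demo_back submodule as a side effect.
root = pathlib.Path(tempfile.mkdtemp())
for pkg in ("demo_front", "demo_back"):
    (root / pkg).mkdir()
    (root / pkg / "__init__.py").write_text("")
(root / "demo_back" / "cublas.py").write_text("")
(root / "demo_back" / "cusolver.py").write_text("")
(root / "demo_front" / "device.py").write_text("import demo_back.cublas\n")
(root / "demo_front" / "stream.py").write_text("import demo_back.cusolver\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# A window that only watches demo_front never sees demo_back.cublas,
# even though that module was loaded while the window was open.
with watch_new_modules("demo_front") as narrow:
    importlib.import_module("demo_front.device")

# Watching both base names catches the side-effect import as well.
with watch_new_modules("demo_front", "demo_back") as wide:
    importlib.import_module("demo_front.stream")

print(sorted(narrow))
print(sorted(wide))
```

The narrow window misses the demo_back submodule entirely, which is roughly the situation with cupy and cupy_backends; the wide window shows what watching more than one base name would pick up.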

There are a few ways to hack this particular case to work in the short term if we need to (e.g., having an import of cupy also observe changes to cupy_backends), but I'd prefer to actually fix the problem. As I see it, there are two problems here:

  • Cython's cimport machinery doesn't reliably respect overrides of __import__.
  • The PyImport_* routines (other than PyImport_Import and things that call it) bypass our current overrides.

The first bullet point needs to be fixed upstream. It will only partially fix the problems we're having with our modified import not always getting called, but I suspect taking care of it would be good enough to cover everything we actually need for demos to work right. This is also a fix that I suspect the Cython devs will be happy to have. I've started working on a patch for this.

The fix for the second bullet point is to set up overrides for PyImport_* (other than PyImport_Import and things that call it) that allow us to observe arbitrary calls to those functions. This will be more of a hassle to set up, but it's still doable. It's what's required to fully address this issue. In particular, we'll have to modify our stub library generation scripts so that they're aware of any overrides for stuff in libpython. There's also some subtlety with interpreter initialization order where, if the builtin import hasn't been changed yet, nothing special should happen.

Related to this issue: importlib.import_module and _frozen_importlib._gcd_import also bypass our __import__ override. Those interfaces aren't frequently used in library code, but it'd probably be worth overriding them too. Most of the work in our import override is done via a context manager, so overriding these additional interfaces isn't hard.