Parla.py icon indicating copy to clipboard operation
Parla.py copied to clipboard

Some import mechanisms bypass a customized builtins.__import__

Open insertinterestingnamehere opened this issue 5 years ago • 7 comments

Although we can currently load CUDA into a VEC, we can't currently load cupy. The error shows up as an assertion failure in our custom import code. I've done some work on diagnosing this, and here's what I think is going on.

Some Python C API routines bypass a customized builtins.__import__. In particular, the PyImport_ImportModuleLevelObject does, and that's one of the C API routines used to implement cimports in Cython. This only hits us with cupy because our import override currently only cares about detecting imports that are the first to load any new submodule of a given base-level module. Our existing examples work fine because, in the wild, it's rare for a cimport to be that kind of first import while also being implemented as an API call that would bypass our modified __import__. IIRC in this case the bad import is when cupy.cuda.device triggers the first load of cupy_backends.cuda.libs.cublas. It's not the first load of stuff from cupy_backends, but prior imports from that module have already completed and no new import of anything in cupy_backends is already in-progress to catch the changes to sys.modules that result from the lazy import of cupy_backends.cuda.libs.cublas. See https://github.com/cupy/cupy/blob/890e40cfd29c2ea37d52fbbef3d2e7d7ceb105d7/cupy/cuda/device.pyx#L8 for the culprit.

There are a few ways to hack this particular case to work in the short term if we need to do that. (e.g., having an import of cupy also observe changes to cupy_backends), but I'd prefer to actually fix the problem. As I see it, there are two problems here:

  • Cython doesn't reliably respect overrides to __import__ with their cimport machinery.
  • The PyImport_* routines (other than PyImport_Import and things that call it) bypass our current overrides.

The first bullet point needs to be fixed upstream and will only partially fix the problems we're having with our modified import not always getting called, but I suspect taking care of that would be good enough to everything we actually need for demos to work right. This is also a fix that I suspect the Cython devs will be happy to have. I've started working on a patch for this.

The fix for the second bullet point is to set up overrides for PyImport_* (other than PyImport_Import and things that call it) that allow us to observe arbitrary calls to those functions. This will be more of a hassle to set up, but it's still doable. It's what's required to fully address this issue. In particular, we'll have to modify our stub library generation scripts so that they're aware of any overrides for stuff in libpython. There's also some subtlety with interpreter initialization order where, if the builtin import hasn't been changed yet, nothing special should happen.

Related to this issue: importlib.import_module and _frozen_importlib._gcd_import also bypass our __import__ override. Those interfaces aren't frequently used in library code, but it'd probably be worth overriding them too. Most of the work in our import override is done via a context manager so overriding these additional interfaces isn't hard.

Arguably we really only need to override _frozen_importlib._gcd_import instead of __import__ and importlib.import_module. That's an internal interface so I'm not 100% sold on this, but it's not likely to change in the forseable future and we've already hacked things in much worse ways than this. Overriding builtins.__import__ and importlib.import_module and not _frozen_importlib._gcd_import is also acceptable in my mind since _frozen_importlib._gcd_import is an internal interface and almost never used in the wild.

Another alternative that now seems like it might be viable: fix the upstream PyImport_* routines to fully respect overrides. I realized today that they are already inconsistent with the documentation. The documentation claims that PyImport_Import is the only routine that respects an override to builtins.__import__, but then their source code goes around and implements PyImport_ImportModule by calling right back into PyImport_Import (https://github.com/python/cpython/blob/622307142130d36a30644233441333247838af38/Python/import.c#L1218), so... maybe the Python devs would be amenable to a fix here? The only catch is that universally calling into the stuff in _frozen_importlib may hurt performance in some cases. The comment at https://github.com/python/cpython/blob/622307142130d36a30644233441333247838af38/Python/import.c#L1564-L1565 implies this, but doesn't specify when it would matter.

Minor follow-up here: after thinking this one over for a while, I think the best approach is to try addressing this upstream in cpython. Letting someone override __import__ but only selectively applying that override strikes me as a poor design choice and it seems like something they'd be open to fixing. The behavior is buggy enough that it seems unlikely to be a meaningful backcompat break. This also breaks standard use-cases for overriding __import__ like when people want to log import calls. Generally overriding __import__ is frowned upon these days, but I haven't figured out a reliable way to get what we want out of the new import hooks system.

Submitted upstream as https://bugs.python.org/issue43870. It seemed like a minor enough thing to just discuss on the issue tracker rather than their mailing list.

We can work around this for now by importing cupy like this:

import cupy_backends.cuda.libs.cublas, cupy_backends.cuda.libs.cusolver, cupy_backends.cuda.libs.cusparse, cupy

That lands me with some other esoteric error, but at least it's something.

Worth noting: this bug is the root cause of a performance pathology in the case where we've imported many many modules. The key pathological loop is https://github.com/ut-parla/Parla.py/blob/d82e6d573a5cd5b1e492a99a0106cd90e632961e/parla/multiload.py#L373. That loop is necessary right now as a partial guard against modules getting imported internal to a module in ways that aren't visible to our __import__ override.

Given the lack of response upstream probably a good plan of action would be to prepare a CPython patch for the C API functions that respects __import__ overrides, see if they respond, then LD_PRELOAD the corrected versions if the patch doesn't gain any traction. It seems like the kind of thing that they'd want to fix though. This kind of inconsistency doesn't seem desirable.