meson-python icon indicating copy to clipboard operation
meson-python copied to clipboard

meson-python editable install method is not compatible with `jinja2.PackageLoader`

Open WillAyd opened this issue 1 year ago • 13 comments

I noticed when trying to upgrade the pandas CI to use more recent versions of meson-python that I started to see jinja2 failures in our test suite. As far as I can tell, jinja2 resolves paths to the package by loading the module specification via calls to importlib.util.find_spec("pandas")

In meson-python < 0.16.0, the debugger shows that specification as:

ModuleSpec(name='pandas', loader=<_pandas_editable_loader.SourceFileLoader object at 0x7ede6cb92e60>, origin='/home/willayd/clones/pandas/pandas/__init__.py', submodule_search_locations=[])

With meson-python 0.16.0, that same specification now shows as:

ModuleSpec(name='pandas', loader=<_pandas_editable_loader.SourceFileLoader object at 0x7d7938ab6020>, origin='/home/willayd/clones/pandas/pandas/__init__.py', submodule_search_locations=['/home/willayd/miniforge3/envs/pandas-dev/lib/python3.10/site-packages/_pandas_editable_loader.py/pandas'])

The problem is that the submodule_search_locations in meson-python 0.16.0 are non-existent, so when jinja2 tries to navigate those it ends up throwing an error like

pandas/tests/io/formats/style/test_exceptions.py:10: in <module>
    from pandas.io.formats.style import Styler
pandas/io/formats/style.py:43: in <module>
    from pandas.io.formats.style_render import (
pandas/io/formats/style_render.py:69: in <module>
    class StylerRenderer:
pandas/io/formats/style_render.py:74: in StylerRenderer
    loader = jinja2.PackageLoader("pandas", "io/formats/templates")
../../miniforge3/envs/pandas-dev/lib/python3.10/site-packages/jinja2/loaders.py:323: in __init__
    raise ValueError(
E   ValueError: The 'pandas' package was not installed in a way that PackageLoader understands.

WillAyd avatar Jan 09 '25 09:01 WillAyd

I asked it before and I'll ask it again: what is the reason for installing in editable mode in a CI job? It makes very little sense to me.

What jinja is doing here is not really kosher. There is no guarantee that submodule_search_locations is a list of filesystems paths. I haven't spend much time trying to understand what jinja is trying to do, but jinja itself knows that its approach is not generic as it has to special-case zip imports already: https://github.com/pallets/jinja/blob/6aeab5d1da0bc0793406d7b402693e779b6cca7a/src/jinja2/loaders.py#L331-L334

We need to have the path set to something recognizable that is not a filesystem path to support other use cases (brought forward by the Pandas developers). If jinja2.PackageLoader does not work for editable installations, and you want to support editable installations, you will have to switch to a different jinja template loader.

dnicolodi avatar Jan 09 '25 14:01 dnicolodi

The editable mode for CI in pandas is just a legacy thing. That can be removed. However, this issue does effect local editable installs as well

WillAyd avatar Jan 09 '25 14:01 WillAyd

jinja2.PackageLoader is not designed to work with module loaders that are not the standard filesystem loader. I don't know a way to reconcile it with editable installations. Any hint toward a possible solution is welcome.

dnicolodi avatar Jan 09 '25 15:01 dnicolodi

Hmm that's unfortunate. I am also not very familiar with the inner workings of jinja2

Since you mentioned that the pandas team asked for this submodule_search_locations addition, do you know what sparked that request? Maybe it was a misunderstanding on our end? I tried searching through the issues but didn't see anything

You definitely know better than me, but from a quick glance at PEP 451 it seems like the intention is for that field to contain a list of valid directories containing modules

WillAyd avatar Jan 09 '25 15:01 WillAyd

The documentation of jinja2.PackageLoader mentions that it does work only in limited cases:

Only packages installed as directories (standard pip behavior) or zip/egg files (less common) are supported. The Python API for introspecting data in packages is too limited to support other installation methods the way this loader requires.

dnicolodi avatar Jan 09 '25 15:01 dnicolodi

You definitely know better than me, but from a quick glance at PEP 451 it seems like the intention is for that field to contain a list of valid directories containing modules

The zip import mechanism in the standard library, uses submodule_search_locations in about the same way we use it. Thus I'm very sure that filesystems path are not the only intended use.

dnicolodi avatar Jan 09 '25 15:01 dnicolodi

Since you mentioned that the pandas team asked for this submodule_search_locations addition, do you know what sparked that request? Maybe it was a misunderstanding on our end? I tried searching through the issues but didn't see anything

I'm not sure the request was directly from Pandas developers. The trick with submodule_search_locations is required to support the use cases described in https://github.com/mesonbuild/meson-python/issues/557 and https://github.com/mesonbuild/meson-python/issues/568

dnicolodi avatar Jan 09 '25 15:01 dnicolodi

I can open an issue upstream to discuss in jinja if that will be helpful.

This is all a bit out of my wheelhouse, so just to clarify the editable installation of a meson-python project injects a custom loader that is responsible for resolving paths like pandas.io.formats.templates to something on disk right? And jinja2 is using its own rules for resolving that namespaces from the root of the pandas project, which isnt' working.

Is there a way for the custom loader to be queried by jinja to better resolve that path that I should suggest upstream?

WillAyd avatar Jan 09 '25 21:01 WillAyd

I don't think that opening an issue upstream is going to accomplish much: the jinja loader is documented to do not work with module loaders different than the standard filesystem loader and the zipimport loader. There is no API that would allow jinja to do what it does in a better way: the Python import system is not meant to be used that way.

However, the fake path is added to the load path only for namespace packages (directories installed as packages that do not contain a __init__.py file). Therefore, you may be able to solve the issue simply adding an empty __init__.py to the templates directory. I have not tried, but it may work.

dnicolodi avatar Jan 09 '25 21:01 dnicolodi

Unfortunately adding the __init__.py file does not seem to change the behavior at all

WillAyd avatar Jan 09 '25 22:01 WillAyd

Would it be possible for Pandas to switch to another loader? I'll have a look later to see if there is another workaround.

dnicolodi avatar Jan 10 '25 08:01 dnicolodi

I think I can use the FileSystemLoader. For our main code base that works well, as the template files are located in the same directory as the code loading them.

I noticed we do have a test case that loads templates from the pandas package, and we have talked about splitting off our tests into a different package. That hasn't happened yet, so I think the FileSystemLoader is definitely viable, but if we ever needed to resolve by package name in the future it would be more limiting than the standard PackageLoader I think

WillAyd avatar Jan 10 '25 14:01 WillAyd

You can construct the filesystem path starting from pandas.io.formats.__file__, or pandas.io.formats.templates.__file__ if you drop a __init__.py in the templates directory. The correct thing would be for jinja to have a ResourcesLoader using the importlib.resources API.

EDIT: I was surprised jinja does not have support for importlib.resources already, but indeed it does not. Searching for more information, I found this open issue https://github.com/pallets/jinja/issues/1978 with a PR implementing it https://github.com/pallets/jinja/pull/1985 This is apparently scheduled for jinja 3.2

dnicolodi avatar Jan 10 '25 14:01 dnicolodi

There isn't anything actionable here: editable installs work as advertised and the jinja2.PackageLoader has the limitations listed in the Jinja documentation. AFAICT the issue has been solved on the Pandas side. Closing.

dnicolodi avatar Oct 27 '25 17:10 dnicolodi