meson-python editable install method is not compatible with `jinja2.PackageLoader`
I noticed when trying to upgrade the pandas CI to use more recent versions of meson-python that I started to see jinja2 failures in our test suite. As far as I can tell, jinja2 resolves paths to the package by loading the module specification via calls to importlib.util.find_spec("pandas")
In meson-python < 0.16.0, the debugger shows that specification as:
ModuleSpec(name='pandas', loader=<_pandas_editable_loader.SourceFileLoader object at 0x7ede6cb92e60>, origin='/home/willayd/clones/pandas/pandas/__init__.py', submodule_search_locations=[])
With meson-python 0.16.0, that same specification now shows as:
ModuleSpec(name='pandas', loader=<_pandas_editable_loader.SourceFileLoader object at 0x7d7938ab6020>, origin='/home/willayd/clones/pandas/pandas/__init__.py', submodule_search_locations=['/home/willayd/miniforge3/envs/pandas-dev/lib/python3.10/site-packages/_pandas_editable_loader.py/pandas'])
The problem is that the submodule_search_locations in meson-python 0.16.0 are non-existent, so when jinja2 tries to navigate those it ends up throwing an error like
pandas/tests/io/formats/style/test_exceptions.py:10: in <module>
from pandas.io.formats.style import Styler
pandas/io/formats/style.py:43: in <module>
from pandas.io.formats.style_render import (
pandas/io/formats/style_render.py:69: in <module>
class StylerRenderer:
pandas/io/formats/style_render.py:74: in StylerRenderer
loader = jinja2.PackageLoader("pandas", "io/formats/templates")
../../miniforge3/envs/pandas-dev/lib/python3.10/site-packages/jinja2/loaders.py:323: in __init__
raise ValueError(
E ValueError: The 'pandas' package was not installed in a way that PackageLoader understands.
I asked it before and I'll ask it again: what is the reason for installing in editable mode in a CI job? It makes very little sense to me.
What jinja is doing here is not really kosher. There is no guarantee that submodule_search_locations is a list of filesystems paths. I haven't spend much time trying to understand what jinja is trying to do, but jinja itself knows that its approach is not generic as it has to special-case zip imports already: https://github.com/pallets/jinja/blob/6aeab5d1da0bc0793406d7b402693e779b6cca7a/src/jinja2/loaders.py#L331-L334
We need to have the path set to something recognizable that is not a filesystem path to support other use cases (brought forward by the Pandas developers). If jinja2.PackageLoader does not work for editable installations, and you want to support editable installations, you will have to switch to a different jinja template loader.
The editable mode for CI in pandas is just a legacy thing. That can be removed. However, this issue does effect local editable installs as well
jinja2.PackageLoader is not designed to work with module loaders that are not the standard filesystem loader. I don't know a way to reconcile it with editable installations. Any hint toward a possible solution is welcome.
Hmm that's unfortunate. I am also not very familiar with the inner workings of jinja2
Since you mentioned that the pandas team asked for this submodule_search_locations addition, do you know what sparked that request? Maybe it was a misunderstanding on our end? I tried searching through the issues but didn't see anything
You definitely know better than me, but from a quick glance at PEP 451 it seems like the intention is for that field to contain a list of valid directories containing modules
The documentation of jinja2.PackageLoader mentions that it does work only in limited cases:
Only packages installed as directories (standard pip behavior) or zip/egg files (less common) are supported. The Python API for introspecting data in packages is too limited to support other installation methods the way this loader requires.
You definitely know better than me, but from a quick glance at PEP 451 it seems like the intention is for that field to contain a list of valid directories containing modules
The zip import mechanism in the standard library, uses submodule_search_locations in about the same way we use it. Thus I'm very sure that filesystems path are not the only intended use.
Since you mentioned that the pandas team asked for this
submodule_search_locationsaddition, do you know what sparked that request? Maybe it was a misunderstanding on our end? I tried searching through the issues but didn't see anything
I'm not sure the request was directly from Pandas developers. The trick with submodule_search_locations is required to support the use cases described in https://github.com/mesonbuild/meson-python/issues/557 and https://github.com/mesonbuild/meson-python/issues/568
I can open an issue upstream to discuss in jinja if that will be helpful.
This is all a bit out of my wheelhouse, so just to clarify the editable installation of a meson-python project injects a custom loader that is responsible for resolving paths like pandas.io.formats.templates to something on disk right? And jinja2 is using its own rules for resolving that namespaces from the root of the pandas project, which isnt' working.
Is there a way for the custom loader to be queried by jinja to better resolve that path that I should suggest upstream?
I don't think that opening an issue upstream is going to accomplish much: the jinja loader is documented to do not work with module loaders different than the standard filesystem loader and the zipimport loader. There is no API that would allow jinja to do what it does in a better way: the Python import system is not meant to be used that way.
However, the fake path is added to the load path only for namespace packages (directories installed as packages that do not contain a __init__.py file). Therefore, you may be able to solve the issue simply adding an empty __init__.py to the templates directory. I have not tried, but it may work.
Unfortunately adding the __init__.py file does not seem to change the behavior at all
Would it be possible for Pandas to switch to another loader? I'll have a look later to see if there is another workaround.
I think I can use the FileSystemLoader. For our main code base that works well, as the template files are located in the same directory as the code loading them.
I noticed we do have a test case that loads templates from the pandas package, and we have talked about splitting off our tests into a different package. That hasn't happened yet, so I think the FileSystemLoader is definitely viable, but if we ever needed to resolve by package name in the future it would be more limiting than the standard PackageLoader I think
You can construct the filesystem path starting from pandas.io.formats.__file__, or pandas.io.formats.templates.__file__ if you drop a __init__.py in the templates directory. The correct thing would be for jinja to have a ResourcesLoader using the importlib.resources API.
EDIT: I was surprised jinja does not have support for importlib.resources already, but indeed it does not. Searching for more information, I found this open issue https://github.com/pallets/jinja/issues/1978 with a PR implementing it https://github.com/pallets/jinja/pull/1985 This is apparently scheduled for jinja 3.2
There isn't anything actionable here: editable installs work as advertised and the jinja2.PackageLoader has the limitations listed in the Jinja documentation. AFAICT the issue has been solved on the Pandas side. Closing.