
Support for customized Python installs which do not include setuptools

Open nchepanov opened this issue 4 years ago • 6 comments

I have

  • [x] Tested with the latest release
  • [x] Tested with the current master branch
  • [x] Searched for similar existing issues

Expected behaviour

https://github.com/django-haystack/pysolr/blob/76c77e33d56dc22b5dc92c2434f7b4a6040f9ef1/pysolr.py#L14

The project uses pkg_resources, which is provided by setuptools, and assumes that setuptools is available in every environment. That holds for a default-constructed venv / virtualenv, but not when the final deployable artifact strips out files that are not declared as required. For example, an application may purge pip, wheel, and setuptools from a container to reduce its size (unless it directly uses and declares a dependency on any of those libraries).

Python libraries that directly use modules provided by setuptools must declare setuptools as a runtime dependency via install_requires.

Actual behaviour

Steps to reproduce the behaviour

python3.9 -m venv venv
source venv/bin/activate
(venv) pip install pysolr
Successfully installed certifi-2021.5.30 chardet-4.0.0 idna-2.10 pysolr-3.9.0 requests-2.25.1 urllib3-1.26.6
(venv) python -c "import pysolr" # works
(venv) pip uninstall setuptools
(venv) python -c "import pysolr" # ModuleNotFoundError
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/dev/pysolr-issue/venv/lib/python3.9/site-packages/pysolr.py", line 14, in <module>
    from pkg_resources import DistributionNotFound, get_distribution, parse_version
ModuleNotFoundError: No module named 'pkg_resources'
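
To check whether an environment is affected without triggering the traceback, one can ask the import system whether pkg_resources can be located. This is only a sketch; the helper name `has_module` is made up for illustration and is not part of pysolr.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be located."""
    # find_spec() returns None for a missing top-level module instead of raising.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Reports False in an environment where setuptools has been removed.
    print("pkg_resources available:", has_module("pkg_resources"))
```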

Configuration

  • Operating system version: Any
  • Search engine version: N.A.
  • Python version: Any
  • pysolr version: latest (any?)

nchepanov avatar Jun 30 '21 19:06 nchepanov

Additionally, pkg_resources is considered deprecated and can be replaced with importlib_metadata (or the stdlib importlib.metadata on Python 3.8+).

>>> from importlib_metadata import version
>>> version("pysolr")
'3.9.0'
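
A minimal sketch of how a library could prefer the stdlib module and fall back to the back-port on older interpreters. The function name `installed_version` and the `"unknown"` default are assumptions made for illustration, not what pysolr does.

```python
try:
    # Python 3.8+: part of the standard library.
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Python 3.7 and earlier: requires the importlib_metadata back-port.
    from importlib_metadata import PackageNotFoundError, version


def installed_version(dist_name: str, default: str = "unknown") -> str:
    """Return the installed version of `dist_name`, or `default` if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return default
```

With pysolr installed as in the session above, `installed_version("pysolr")` would return the same string as `version("pysolr")`.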

nchepanov avatar Jun 30 '21 20:06 nchepanov

In general we only support standard Python environments but if you want to send a pull-request defaulting to using importlib.metadata that could be useful. Since importlib_metadata is an official back-port it should be relatively straightforward to add it as a conditional install for Python 3.7 (which is still supported until 2023) and earlier using environment markers.
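
For reference, a conditional install via environment markers could look roughly like the fragment below. It is a hypothetical piece of a setup.py, not pysolr's actual one; only the marker syntax is the point, and the names `INSTALL_REQUIRES` and `NEEDS_BACKPORT` are illustrative.

```python
import sys

# Hypothetical fragment of a setup.py. The text after ";" is an environment
# marker: the requirement is installed only where the marker evaluates to true.
INSTALL_REQUIRES = [
    'importlib_metadata; python_version < "3.8"',
]

# The same condition expressed at runtime, shown only to make the marker concrete.
NEEDS_BACKPORT = sys.version_info < (3, 8)

# In a real setup.py the list would be passed to setuptools as
# setup(install_requires=INSTALL_REQUIRES) alongside the other arguments.
```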

acdha avatar Jul 06 '21 16:07 acdha

> In general we only support standard Python environments

I'm not sure what a standard Python environment is. I just want to point out that the pysolr Python package has a runtime dependency on a non-stdlib library that must be declared in setup.py::install_requires. The library is setuptools, and the module it provides is pkg_resources.

> defaulting to using importlib.metadata that could be useful

I can help with that, the only problem is pysolr.version_info:

https://github.com/django-haystack/pysolr/blob/76c77e33d56dc22b5dc92c2434f7b4a6040f9ef1/pysolr.py#L70

It exposes an instance of pkg_resources._vendor.packaging.version.Version, which is not part of importlib_metadata. It is instead provided by the packaging library: https://packaging.pypa.io/en/latest/version.html

There are a few options:

  • remove pysolr.version_info from the public API
  • implement a fake Version class for backwards compatibility
  • keep it in the public API and add an unconditional dependency on https://pypi.org/project/packaging/
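
As a rough illustration of the second option, the sketch below parses the version string into a plain tuple of integers instead of a full Version class. This is a deliberate simplification: a tuple supports ordering comparisons but has none of the attributes of packaging's Version and ignores pre-release and dev suffixes. The function name `parse_version_tuple` is hypothetical.

```python
import re
from typing import Tuple


def parse_version_tuple(version: str) -> Tuple[int, ...]:
    """Turn the leading numeric components of e.g. '3.9.0' into (3, 9, 0)."""
    # Match only the leading dotted run of digits; anything after it is dropped.
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))
```

Code that only compares versions, such as `parse_version_tuple("3.9.0") >= (3, 8)`, would keep working with this; code that relies on Version-specific attributes would not.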

nchepanov avatar Jul 06 '21 17:07 nchepanov

> I'm not sure what a standard Python environment is. I just want to point out that the pysolr Python package has a runtime dependency on a non-stdlib library that must be declared in setup.py::install_requires. The library is setuptools, and the module it provides is pkg_resources.

As far as I can tell, the only way this situation arises is when someone removes setuptools as part of their install process. That's not a common situation so it's not surprising that most projects don't have first-class support for it.

> I can help with that, the only problem is pysolr.version_info:

version_info was added about 5 years ago when __version__ was changed to be a string, with version_info holding what had been in __version__ until that point. I suspect it's rarely used, if ever, but we probably want to either ship this in a breaking release or do something like emit a deprecation warning. Since we've never supported running without setuptools / pkg_resources before, one option would simply be to tolerate an ImportError with a DeprecationWarning that version_info will go away in a future release, since any existing project must already have those libraries installed.
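
One possible reading of that last option, sketched under assumptions: the function name `load_version_info`, the warning text, and the choice to return None when pkg_resources is missing are all illustrative rather than pysolr's actual code.

```python
import warnings


def load_version_info(version_string: str):
    """Return a parsed version object, or None when pkg_resources is missing."""
    try:
        # Only available when setuptools is installed.
        from pkg_resources import parse_version
    except ImportError:
        warnings.warn(
            "version_info is unavailable without setuptools and will be "
            "removed in a future release",
            DeprecationWarning,
            stacklevel=2,
        )
        return None
    return parse_version(version_string)
```

Existing projects that still have setuptools installed would see no change in behaviour, while stripped-down environments would import cleanly and get None instead of a ModuleNotFoundError.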

acdha avatar Jul 06 '21 17:07 acdha

> As far as I can tell, the only way this situation arises is when someone removes setuptools as part of their install process. That's not a common situation so it's not surprising that most projects don't have first-class support for it.

This is not the case. Increased adoption of containerised workflows means separation between build-time dependencies and runtime dependencies.

My team uses a build system that uses setuptools to build the application, but the produced package only contains the requirements explicitly stated in setup.py (and their transitive dependencies).

This means that we have to add setuptools to our own runtime dependencies because pysolr doesn't declare it; otherwise it won't be installed in the container. I think depending on non-stdlib packages that you don't also declare as dependencies is generally a bad idea.

Lewiky avatar May 04 '22 14:05 Lewiky

> This is not the case. Increased adoption of containerised workflows means separation between build-time dependencies and runtime dependencies.

Yes, those tools are becoming more common, but that's a recent development and still a minority of installations. I don't see what you intend to gain from being argumentative here, but since Python 3's stdlib now includes importlib.metadata, I would think that the best path forward is to adopt that with a fallback to importlib_metadata, since Python 3.7's EOL is still over a year away at this point. If you wanted to send a merge request to better support your team's environment, I think that would be a great contribution, and we can ship it as a major release just in case there's anyone in the world still using the older version_info attribute.

acdha avatar May 04 '22 15:05 acdha