importlib_metadata
Files for a src-style project installed in editable or develop mode
In GitLab by @sinoroc on Feb 23, 2020, 18:34
In the case of a project with a src-style directory structure that is installed in develop or editable mode, it is not immediately possible to access the files.
.
├── setup.py
└── src
    └── mypackage
        └── __init__.py
#!/usr/bin/env python3
import setuptools

setuptools.setup(
    name='MyProject',
    version='0.0.0.dev0',
    packages=setuptools.find_packages(where='src'),
    package_dir={
        '': 'src',
    },
)
# src/mypackage/__init__.py
import importlib_metadata

print("importlib_metadata.__version__", importlib_metadata.__version__)
dist = importlib_metadata.distribution('MyProject')
print("dist.locate_file('')", dist.locate_file(''))
for file_ in dist.files:
    print("file_", file_)
for file_ in dist.files:
    print("file_", file_)
    print(file_.read_text())
$ ./setup.py develop
running develop
running egg_info
creating src/MyProject.egg-info
writing src/MyProject.egg-info/PKG-INFO
writing dependency_links to src/MyProject.egg-info/dependency_links.txt
writing top-level names to src/MyProject.egg-info/top_level.txt
writing manifest file 'src/MyProject.egg-info/SOURCES.txt'
reading manifest file 'src/MyProject.egg-info/SOURCES.txt'
writing manifest file 'src/MyProject.egg-info/SOURCES.txt'
running build_ext
Creating /tmp/tmp.BzMk6xAF0S/.venv/lib/python3.6/site-packages/MyProject.egg-link (link to src)
MyProject 0.0.0.dev0 is already the active version in easy-install.pth
Installed /tmp/tmp.BzMk6xAF0S/src
Processing dependencies for MyProject==0.0.0.dev0
Finished processing dependencies for MyProject==0.0.0.dev0
$ python -c 'import mypackage'
importlib_metadata.__version__ 1.5.0
dist.locate_file('') /tmp/tmp.BzMk6xAF0S/src
file_ setup.py
file_ src/MyProject.egg-info/PKG-INFO
file_ src/MyProject.egg-info/SOURCES.txt
file_ src/MyProject.egg-info/dependency_links.txt
file_ src/MyProject.egg-info/top_level.txt
file_ src/mypackage/__init__.py
file_ setup.py
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/tmp.BzMk6xAF0S/src/mypackage/__init__.py", line 11, in <module>
print(file_.read_text())
File "/tmp/tmp.BzMk6xAF0S/.venv/lib/python3.6/site-packages/importlib_metadata/__init__.py", line 140, in read_text
with self.locate().open(encoding=encoding) as stream:
File "/usr/lib/python3.6/pathlib.py", line 1183, in open
opener=self._opener)
File "/usr/lib/python3.6/pathlib.py", line 1037, in _opener
return self._accessor.open(self, flags, mode)
File "/usr/lib/python3.6/pathlib.py", line 387, in wrapped
return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp.BzMk6xAF0S/src/setup.py'
This is obviously a special case: the combination of editable (or develop) mode with the setuptools package_dir option seems quite tricky to deal with.
In GitLab by @blueyed on Mar 17, 2020, 12:34
mentioned in commit blueyed/importlib_metadata@5d73789b0619d5ac53dfb2966902c13b24d31e2b
In GitLab by @blueyed on Mar 17, 2020, 12:35
This is also a problem when not using "wheel". I've created https://gitlab.com/python-devs/importlib_metadata/-/merge_requests/114, which addresses this but is only an unfinished "quick hack"; I would appreciate feedback over there.
Although some work was done on related issues, I believe this one remains. I welcome attempts to revive that work or come up with a new approach.
This probably works (somehow) for pip install -e . (when PEP 660 is used), right? importlib.metadata should be able to find the .dist-info directory because it is added to sys.path. (Everything that depends on the RECORD file will be weird, but that is a general problem with editable installs.)
So, if we perform a modern editable installation with setuptools, importlib.metadata will find the proper .dist-info first (there will still be a src/*.egg-info on the path, but it comes after):
# > docker run --rm -it python:3.12-bookworm /bin/bash
mkdir -p /tmp/proj && cd /tmp/proj
cat <<EOF > pyproject.toml
[build-system]
requires = ["setuptools>=74.1.2"]
build-backend = "setuptools.build_meta"
EOF
mkdir -p src tests
touch src/proj.py
python -m venv .venv
.venv/bin/python -m pip install -U importlib_metadata
.venv/bin/python -m pip install -e .
.venv/bin/python
>>> import importlib_metadata as metadata
>>> metadata.distribution("proj")._path
PosixPath('/tmp/proj/.venv/lib/python3.12/site-packages/proj-0.0.0.dist-info')
>>> [d._path for d in metadata.Distribution.discover(name="proj")]
[PosixPath('/tmp/proj/.venv/lib/python3.12/site-packages/proj-0.0.0.dist-info'), PosixPath('/tmp/proj/src/proj.egg-info')]
>>> dists = list(metadata.Distribution.discover(name="proj"))
>>> dists[0].files
[PackagePath('__editable__.proj-0.0.0.pth'), PackagePath('proj-0.0.0.dist-info/INSTALLER'), PackagePath('proj-0.0.0.dist-info/METADATA'), PackagePath('proj-0.0.0.dist-info/RECORD'), PackagePath('proj-0.0.0.dist-info/REQUESTED'), PackagePath('proj-0.0.0.dist-info/WHEEL'), PackagePath('proj-0.0.0.dist-info/direct_url.json'), PackagePath('proj-0.0.0.dist-info/top_level.txt')]
>>> dists[1].files
[]
As mentioned above, the *.dist-info folder resulting from an editable install will not have the proper .py files in RECORD (because it uses a .pth file):
# cat .venv/lib/python3.12/site-packages/proj-0.0.0.dist-info/RECORD
__editable__.proj-0.0.0.pth,sha256=XPp0xdLqXFR6kF2e4inRSr6TMwICwiKNaxmrwpJL7Mw,14
proj-0.0.0.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
proj-0.0.0.dist-info/METADATA,sha256=nZ_rXvwV8Ga5N5gGdUug8soyeMqt1_cO15vN1XN2Tfw,49
proj-0.0.0.dist-info/RECORD,,
proj-0.0.0.dist-info/REQUESTED,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
proj-0.0.0.dist-info/WHEEL,sha256=cVxcB9AmuTcXqmwrtPhNK88dr7IR_b6qagTj0UvIEbY,91
proj-0.0.0.dist-info/direct_url.json,sha256=ZZzJsymE1uuQ_YSuj329jxgglIcLOhr53kUF-Iik53g,59
proj-0.0.0.dist-info/top_level.txt,sha256=kUODLUKrmhiCRxf_flnhS_wRM9siiDohJ4iCL_SE8IM,5
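To make the point concrete, here is a minimal sketch that parses a RECORD file and looks for Python sources. The helper name `recorded_paths` is made up for illustration, and the sample text is three of the rows from the listing above; RECORD rows are CSV in the form path,hash,size.

```python
import csv
import io

# Three of the RECORD rows shown in the listing above.
RECORD_TEXT = """\
__editable__.proj-0.0.0.pth,sha256=XPp0xdLqXFR6kF2e4inRSr6TMwICwiKNaxmrwpJL7Mw,14
proj-0.0.0.dist-info/METADATA,sha256=nZ_rXvwV8Ga5N5gGdUug8soyeMqt1_cO15vN1XN2Tfw,49
proj-0.0.0.dist-info/RECORD,,
"""


def recorded_paths(record_text):
    """Hypothetical helper: return the path column of a RECORD file."""
    reader = csv.reader(io.StringIO(record_text))
    return [row[0] for row in reader if row]


paths = recorded_paths(RECORD_TEXT)
# An editable install records the .pth file and metadata, not the sources.
python_sources = [p for p in paths if p.endswith(".py")]
print(python_sources)
```

For this editable install the filtered list is empty, which is why anything built on RECORD (including files()) cannot see proj.py.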
On the other hand, the *.egg-info folder will list the .py files in SOURCES.txt; however, these are not going to be relative to the same parent directory as *.egg-info but rather to the project root:
# cat src/proj.egg-info/SOURCES.txt
pyproject.toml
src/proj.py
src/proj.egg-info/PKG-INFO
src/proj.egg-info/SOURCES.txt
src/proj.egg-info/dependency_links.txt
So importlib-metadata could use those to find the files, but it would have to be mindful of the directory they are relative to. In other words, instead of resolving the files as os.path.join(os.path.dirname(egg_info_path), egg_info_line), it would have to do os.path.join(os.path.dirname(egg_info_path), '..', egg_info_line).
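As a rough sketch of that adjustment (not something importlib_metadata provides), the helper below resolves SOURCES.txt entries against the project root rather than the parent of the egg-info directory. The name `resolve_sources` and the `levels_up` parameter are assumptions made for illustration; `levels_up=1` matches a plain src layout like the one above.

```python
from pathlib import Path


def resolve_sources(egg_info_path, levels_up=1):
    """Hypothetical helper: resolve SOURCES.txt entries to absolute paths.

    levels_up is an assumption: how many directories above the egg-info's
    parent the project root lies (1 for a plain ``src`` layout).
    """
    egg_info = Path(egg_info_path)
    root = egg_info.parent
    for _ in range(levels_up):
        root = root.parent
    text = (egg_info / "SOURCES.txt").read_text(encoding="utf-8")
    # Each non-empty line is a path relative to the project root.
    return [root / line.strip() for line in text.splitlines() if line.strip()]
```

Called as `resolve_sources("/tmp/proj/src/proj.egg-info")`, this joins each entry to /tmp/proj instead of /tmp/proj/src. The hard part in practice is knowing `levels_up`, since nothing in the egg-info metadata states where the project root is.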
Thanks for the added clarity.
Regarding the egg-info, I'd like to avoid adding more special cases for setuptools-specific behaviors. In fact, I'd really like to see egg-info generation removed.
I'm pleased to see that the PEP 660 behaviors are trending toward some reasonable behavior. Although files() isn't going to return the list of sources in the project, it is going to return the list of files RECORDed by the wheel. And maybe that's good enough for editable installs. After all, an editable install can't possibly know what files are present, as new files can be added without affecting the metadata. And that's the behavior users are going to see when using other backends.
Maybe what files() should do is resolve Distribution.origin.url (from PEP 610 direct_url.json) into all of the paths under that URL.
That will do the "right thing" by some accounts, essentially providing PackagePath objects for every file in the source checkout. What it still doesn't do is resolve files in a src-layout or essential layout to the paths at which they would appear when installed. For example, setuptools-scm uses a src-layout, but origin.url points to the root of the repo instead of everything under src:
>>> import importlib_metadata as md
>>> dist = md.distribution('setuptools-scm')
>>> dist.origin
namespace(dir_info=namespace(editable=True), url='file:///Users/jaraco/code/pypa/setuptools-scm')
Similarly, for essential layout, the origin.url will point to the root of the repo and not recognize those files exist under some namespace.
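For illustration only, the idea of resolving the origin URL into paths might look like the sketch below. It reads the PEP 610 direct_url.json text directly rather than relying on Distribution.origin, and the helper name `files_from_direct_url` is hypothetical.

```python
import json
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def files_from_direct_url(direct_url_text):
    """Hypothetical helper: list files under an editable install's origin.

    Returns paths relative to the origin directory, or None when the
    metadata does not describe a local editable install.
    """
    info = json.loads(direct_url_text)
    editable = info.get("dir_info", {}).get("editable", False)
    parsed = urlparse(info.get("url", ""))
    if not editable or parsed.scheme != "file":
        return None
    root = Path(url2pathname(parsed.path))
    # Every file in the checkout, relative to the repo root.
    return sorted(p.relative_to(root) for p in root.rglob("*") if p.is_file())
```

A caller could pass `dist.read_text("direct_url.json")` after checking that it is not None. The sketch has exactly the limitations described above: it returns everything in the checkout (including unrelated files such as a virtual environment), and a src-layout file comes back as src/proj.py rather than proj.py.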
I'm inclined to say that files() is doing the best it can with what it has (presenting the files as given by the dist-info metadata), that it should deprecate support for egg-info files, and leave it at that until the packaging ecosystem provides a standard for resolving the "files" for an editable-installed package.
The broader issue is being tracked in https://github.com/pypa/packaging-problems/issues/620.