
Files for a src-style project installed in editable or develop mode

Open jaraco opened this issue 4 years ago • 5 comments

In GitLab by @sinoroc on Feb 23, 2020, 18:34

In the case of a project with a src-style directory structure that is installed in develop or editable mode, the files reported by the distribution cannot be read directly, because their paths do not resolve to existing locations.

.
├── setup.py
└── src
    └── mypackage
        └── __init__.py

#!/usr/bin/env python3

import setuptools

setuptools.setup(
    name='MyProject',
    version='0.0.0.dev0',
    packages=setuptools.find_packages(where='src'),
    package_dir={
        '': 'src',
    },
)

import importlib_metadata

print("importlib_metadata.__version__", importlib_metadata.__version__)

dist = importlib_metadata.distribution('MyProject')
print("dist.locate_file('')", dist.locate_file(''))
for file_ in dist.files:
    print("file_", file_)
for file_ in dist.files:
    print("file_", file_)
    print(file_.read_text())

$ ./setup.py develop
running develop
running egg_info
creating src/MyProject.egg-info
writing src/MyProject.egg-info/PKG-INFO
writing dependency_links to src/MyProject.egg-info/dependency_links.txt
writing top-level names to src/MyProject.egg-info/top_level.txt
writing manifest file 'src/MyProject.egg-info/SOURCES.txt'
reading manifest file 'src/MyProject.egg-info/SOURCES.txt'
writing manifest file 'src/MyProject.egg-info/SOURCES.txt'
running build_ext
Creating /tmp/tmp.BzMk6xAF0S/.venv/lib/python3.6/site-packages/MyProject.egg-link (link to src)
MyProject 0.0.0.dev0 is already the active version in easy-install.pth

Installed /tmp/tmp.BzMk6xAF0S/src
Processing dependencies for MyProject==0.0.0.dev0
Finished processing dependencies for MyProject==0.0.0.dev0
$ python -c 'import mypackage'
importlib_metadata.__version__ 1.5.0
dist.locate_file('') /tmp/tmp.BzMk6xAF0S/src
file_ setup.py
file_ src/MyProject.egg-info/PKG-INFO
file_ src/MyProject.egg-info/SOURCES.txt
file_ src/MyProject.egg-info/dependency_links.txt
file_ src/MyProject.egg-info/top_level.txt
file_ src/mypackage/__init__.py
file_ setup.py
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/tmp.BzMk6xAF0S/src/mypackage/__init__.py", line 11, in <module>
    print(file_.read_text())
  File "/tmp/tmp.BzMk6xAF0S/.venv/lib/python3.6/site-packages/importlib_metadata/__init__.py", line 140, in read_text
    with self.locate().open(encoding=encoding) as stream:
  File "/usr/lib/python3.6/pathlib.py", line 1183, in open
    opener=self._opener)
  File "/usr/lib/python3.6/pathlib.py", line 1037, in _opener
    return self._accessor.open(self, flags, mode)
  File "/usr/lib/python3.6/pathlib.py", line 387, in wrapped
    return strfunc(str(pathobj), *args)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp.BzMk6xAF0S/src/setup.py'

This is obviously a special case; the combination of editable (or develop) mode with the setuptools package_dir option seems quite tricky to deal with.
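To make the failure easier to see, here is a minimal sketch of the path arithmetic that matches the output above. It assumes that lookups are anchored at the directory containing the egg-info (which is what the printed dist.locate_file('') suggests); the helper name resolve_next_to_egg_info is made up for this illustration and is not part of importlib_metadata.

```python
from pathlib import PurePosixPath


def resolve_next_to_egg_info(egg_info, entry):
    """Join an entry onto the directory that contains the egg-info.

    Hypothetical helper: it only mirrors the base directory observed in
    the report, it is not the library's implementation.
    """
    return PurePosixPath(egg_info).parent / entry


# Location of the egg-info directory taken from the report above.
egg_info = "/tmp/tmp.BzMk6xAF0S/src/MyProject.egg-info"

# Entries as printed by dist.files; they are relative to the project root.
for entry in ("setup.py", "src/mypackage/__init__.py"):
    print(entry, "->", resolve_next_to_egg_info(egg_info, entry))
```

The first entry resolves to the same nonexistent path named in the FileNotFoundError, and the second gains a doubled src/ component, because both are joined onto src/ instead of the project root.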

jaraco avatar Oct 22 '20 21:10 jaraco

In GitLab by @blueyed on Mar 17, 2020, 12:34

mentioned in commit blueyed/importlib_metadata@5d73789b0619d5ac53dfb2966902c13b24d31e2b

jaraco avatar Oct 22 '20 21:10 jaraco

In GitLab by @blueyed on Mar 17, 2020, 12:34

mentioned in merge request !114

jaraco avatar Oct 22 '20 21:10 jaraco

In GitLab by @blueyed on Mar 17, 2020, 12:35

This is also a problem when not using "wheel". I've created https://gitlab.com/python-devs/importlib_metadata/-/merge_requests/114, which addresses this but is only an unfinished "quick hack"; I would appreciate feedback over there.

jaraco avatar Oct 22 '20 21:10 jaraco

Although some work was done on related issues, I believe this one remains. I welcome efforts to revive that work or come up with a new approach.

jaraco avatar Jan 10 '21 20:01 jaraco

This probably works (somehow) for pip install -e . (when PEP 660 is used), right? importlib.metadata should be able to find the .dist-info directory because it is added to sys.path.

(Everything that depends on the RECORD file will be weird, but that is a general problem with editable installs)

abravalheri avatar May 02 '24 08:05 abravalheri

So, if we perform a modern editable installation with setuptools, importlib.metadata will find the proper .dist-info first (there will still be a src/*.egg-info on the path, but it comes after):

# > docker run --rm -it python:3.12-bookworm /bin/bash

mkdir -p /tmp/proj && cd /tmp/proj

cat <<EOF > pyproject.toml
[build-system]
requires = ["setuptools>=74.1.2"]
build-backend = "setuptools.build_meta"
EOF

mkdir -p src tests
touch src/proj.py

python -m venv .venv
.venv/bin/python -m pip install -U importlib_metadata
.venv/bin/python -m pip install -e .
.venv/bin/python
>>> import importlib_metadata as metadata
>>> metadata.distribution("proj")._path
PosixPath('/tmp/proj/.venv/lib/python3.12/site-packages/proj-0.0.0.dist-info')
>>> [d._path for d in metadata.Distribution.discover(name="proj")]
[PosixPath('/tmp/proj/.venv/lib/python3.12/site-packages/proj-0.0.0.dist-info'), PosixPath('/tmp/proj/src/proj.egg-info')]
>>> dists = list(metadata.Distribution.discover(name="proj"))
>>> dists[0].files
[PackagePath('__editable__.proj-0.0.0.pth'), PackagePath('proj-0.0.0.dist-info/INSTALLER'), PackagePath('proj-0.0.0.dist-info/METADATA'), PackagePath('proj-0.0.0.dist-info/RECORD'), PackagePath('proj-0.0.0.dist-info/REQUESTED'), PackagePath('proj-0.0.0.dist-info/WHEEL'), PackagePath('proj-0.0.0.dist-info/direct_url.json'), PackagePath('proj-0.0.0.dist-info/top_level.txt')]
>>> dists[1].files
[]

As mentioned above, the *.dist-info folder resulting from an editable install will not list the project's .py files in RECORD (because the install uses a .pth file):

# cat .venv/lib/python3.12/site-packages/proj-0.0.0.dist-info/RECORD
__editable__.proj-0.0.0.pth,sha256=XPp0xdLqXFR6kF2e4inRSr6TMwICwiKNaxmrwpJL7Mw,14
proj-0.0.0.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
proj-0.0.0.dist-info/METADATA,sha256=nZ_rXvwV8Ga5N5gGdUug8soyeMqt1_cO15vN1XN2Tfw,49
proj-0.0.0.dist-info/RECORD,,
proj-0.0.0.dist-info/REQUESTED,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
proj-0.0.0.dist-info/WHEEL,sha256=cVxcB9AmuTcXqmwrtPhNK88dr7IR_b6qagTj0UvIEbY,91
proj-0.0.0.dist-info/direct_url.json,sha256=ZZzJsymE1uuQ_YSuj329jxgglIcLOhr53kUF-Iik53g,59
proj-0.0.0.dist-info/top_level.txt,sha256=kUODLUKrmhiCRxf_flnhS_wRM9siiDohJ4iCL_SE8IM,5
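As a quick check of that observation, the sketch below parses a few of the RECORD lines shown above with the standard csv module (RECORD is a CSV file of path, hash, size) and filters for .py entries. The function name recorded_paths is hypothetical and only used for this example.

```python
import csv
import io

# Three lines copied from the RECORD file shown above.
RECORD_TEXT = """\
__editable__.proj-0.0.0.pth,sha256=XPp0xdLqXFR6kF2e4inRSr6TMwICwiKNaxmrwpJL7Mw,14
proj-0.0.0.dist-info/METADATA,sha256=nZ_rXvwV8Ga5N5gGdUug8soyeMqt1_cO15vN1XN2Tfw,49
proj-0.0.0.dist-info/RECORD,,
"""


def recorded_paths(record_text):
    """Return the first column (the path) of every non-empty RECORD row."""
    return [row[0] for row in csv.reader(io.StringIO(record_text)) if row]


paths = recorded_paths(RECORD_TEXT)
# Keep only Python sources; for this editable install there are none.
python_sources = [path for path in paths if path.endswith(".py")]
print(python_sources)
```

The only installed artifact besides the metadata is the .pth file, so nothing in RECORD points at src/proj.py.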

On the other hand, the *.egg-info folder will list the .py files in SOURCES.txt; however, these paths are not relative to the parent directory of *.egg-info but rather to the project root:

# cat src/proj.egg-info/SOURCES.txt
pyproject.toml
src/proj.py
src/proj.egg-info/PKG-INFO
src/proj.egg-info/SOURCES.txt
src/proj.egg-info/dependency_links.txt

So importlib-metadata could use those to find the files, but it would have to be mindful of the directory they are relative to. In other words, instead of computing each file's path as os.path.join(os.path.dirname(egg_info_path), egg_info_line), it would have to use os.path.join(os.path.dirname(egg_info_path), '..', egg_info_line).
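A rough sketch of that adjusted join follows. It is not what importlib-metadata does today; it assumes a src-layout where the egg-info sits exactly one directory below the project root (a flat layout would need no '..'), and resolve_sources and demo are hypothetical names introduced for illustration.

```python
import os
import tempfile


def resolve_sources(egg_info_path):
    """Resolve SOURCES.txt entries against the project root.

    Assumption: the egg-info directory is exactly one level below the
    project root (src-layout), hence the single '..'.
    """
    egg_info_path = os.path.normpath(egg_info_path)
    project_root = os.path.normpath(
        os.path.join(os.path.dirname(egg_info_path), "..")
    )
    sources_txt = os.path.join(egg_info_path, "SOURCES.txt")
    with open(sources_txt, encoding="utf-8") as stream:
        entries = [line.strip() for line in stream if line.strip()]
    return [os.path.join(project_root, entry) for entry in entries]


def demo():
    """Build a throwaway src-layout tree and report which entries exist."""
    with tempfile.TemporaryDirectory() as root:
        egg_info = os.path.join(root, "src", "proj.egg-info")
        os.makedirs(egg_info)
        # Create the two project files named in SOURCES.txt.
        for name in ("pyproject.toml", os.path.join("src", "proj.py")):
            open(os.path.join(root, name), "w").close()
        sources_txt = os.path.join(egg_info, "SOURCES.txt")
        with open(sources_txt, "w", encoding="utf-8") as stream:
            stream.write(
                "pyproject.toml\nsrc/proj.py\nsrc/proj.egg-info/SOURCES.txt\n"
            )
        return [os.path.exists(path) for path in resolve_sources(egg_info)]


print(demo())
```

With the extra '..', every entry in the throwaway tree resolves to a file that exists; without it, the same entries would be looked up under src/.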

abravalheri avatar Sep 10 '24 11:09 abravalheri

Thanks for the added clarity.

Regarding the egg-info, I'd like to avoid adding more special cases for setuptools-specific behaviors. In fact, I'd really like to see egg-info generation removed.

I'm pleased to see that the PEP 660 behaviors are trending toward something reasonable. Although files() isn't going to return the list of sources in the project, it is going to return the list of files RECORDed by the wheel. And maybe that's good enough for editable installs. After all, an editable install can't possibly know what files are present, as new files can be added without affecting the metadata. And that's the behavior users are going to see when using other backends.

Maybe what files() should do is resolve Distribution.origin.url (from PEP 610 direct_url.json) into all of the paths under that URL.

That will do the "right thing" by some accounts, essentially providing PackagePath objects for every file in the source checkout. What it still doesn't do is resolve files in a src-layout or essential layout to the paths at which they would appear when installed. For example, setuptools-scm uses a src-layout, but origin.url points to the root of the repo instead of everything under src:

>>> import importlib_metadata as md
>>> dist = md.distribution('setuptools-scm')
>>> dist.origin
namespace(dir_info=namespace(editable=True), url='file:///Users/jaraco/code/pypa/setuptools-scm')

Similarly, for essential layout, the origin.url will point to the root of the repo, and nothing will recognize that those files exist under some namespace.
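For the sake of discussion, the idea of expanding origin.url into paths could look roughly like the sketch below. It only handles file:// URLs, takes the URL as a plain string rather than reading it from a distribution, and files_under_origin and demo are hypothetical names, not an existing or proposed API.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import url2pathname


def files_under_origin(url):
    """List every file below a file:// URL, relative to that URL's root.

    Assumption: the URL has the shape of the 'url' field in a PEP 610
    direct_url.json for a local directory.
    """
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError(f"only file:// URLs are handled here, got {url!r}")
    root = Path(url2pathname(parts.path))
    return sorted(
        path.relative_to(root) for path in root.rglob("*") if path.is_file()
    )


def demo():
    """Create a tiny src-layout checkout and list it through its URL."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "src" / "pkg").mkdir(parents=True)
        (root / "pyproject.toml").touch()
        (root / "src" / "pkg" / "__init__.py").touch()
        return files_under_origin(root.as_uri())


print(demo())
```

Note that the results keep the src/ prefix and include everything in the checkout (build configuration, and in a real repository also things like VCS data), which is exactly the limitation described above: the paths are those of the source tree, not those the files would have when installed.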

I'm inclined to say that files() is doing the best it can with what it has (presenting the files as given by the dist-info metadata), that it should deprecate support for egg-info files, and leave it at that until the packaging ecosystem provides a standard for resolving the "files" for an editable-installed package.

jaraco avatar Sep 11 '24 12:09 jaraco

The broader issue is being tracked in https://github.com/pypa/packaging-problems/issues/620.

jaraco avatar Sep 11 '24 13:09 jaraco