pip doesn't support relative paths in direct URL references

Open Qix- opened this issue 6 years ago • 16 comments

Environment

  • pip version: 19.1.1
  • Python version: 3.7
  • OS: MacOS Mojave

Description

I'm not sure if this is a six bug or a Pip bug. Excuse me if it belongs to six.

Pip seems to allow local paths in install_requires via name @ ./some/path, but the URL parsing is terribly broken.

https://github.com/pypa/pip/blob/a38a0eacd4650a7d9af4e6831a2c0aed0b6a0329/src/pip/_internal/download.py#L670-L690

In this function, it uses urlsplit to get the individual components of the incoming URL.

Here's what that looks like with a few pieces of input:

./foo/bar -> SplitResult(scheme='', netloc='', path='./foo/bar', query='', fragment='')
file:./foo/bar -> SplitResult(scheme='file', netloc='', path='./foo/bar', query='', fragment='')
file://./foo/bar -> SplitResult(scheme='file', netloc='.', path='/foo/bar', query='', fragment='')

Notice the last one results in the netloc being . instead of empty and the path being absolute, not local. This trips the error regarding non-local paths. That's all fine and well - I can use the second form to satisfy the conditional logic (though it really ought to support the first as well).
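The three cases above can be reproduced with Python 3's standard `urllib.parse` (the issue itself goes through `six.moves`). The `is_local_file_url` helper below is a hypothetical simplification of the netloc check described here, written for illustration only; it is not pip's actual code.

```python
from urllib.parse import urlsplit


def is_local_file_url(url):
    """Hypothetical helper: True if `url` is a file: URL whose netloc is
    empty or 'localhost', i.e. it does not point at another host."""
    parts = urlsplit(url)
    return parts.scheme == "file" and parts.netloc in ("", "localhost")


for candidate in ("./foo/bar", "file:./foo/bar", "file://./foo/bar"):
    parts = urlsplit(candidate)
    # Show the pieces that the check looks at, then the verdict.
    print(candidate, "->", repr(parts.netloc), repr(parts.path),
          is_local_file_url(candidate))
```

Only the `file:./foo/bar` form passes this simplified check: the bare path has no scheme, and `file://./foo/bar` is read as host `.` with the absolute path `/foo/bar`.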

However, there's conflicting logic elsewhere...

https://github.com/pypa/pip/blob/169eebdb6e36a31e545804228cad94902a1ec8e9/src/pip/_vendor/packaging/requirements.py#L103-L106

This is the logic that fails even though we satisfy the prior logic.

Here's a test function that shows the problem:

from six.moves.urllib import parse as urllib_parse

def tryparse(url):
    print(url)
    parsed = urllib_parse.urlparse(url)
    unparsed = urllib_parse.urlunparse(parsed)
    parsed_again = urllib_parse.urlparse(unparsed)
    print(parsed)
    print(unparsed)
    print(parsed_again)

Here's the output for ./foo/bar:

>>> tryparse('./foo/bar')
./foo/bar
ParseResult(scheme='', netloc='', path='./foo/bar', params='', query='', fragment='')
./foo/bar
ParseResult(scheme='', netloc='', path='./foo/bar', params='', query='', fragment='')

All good, though it doesn't satisfy the first function's logic of requiring a scheme of file:.

Here's the output for file:./foo/bar:

>>> tryparse('file:./foo/bar')
file:./foo/bar
ParseResult(scheme='file', netloc='', path='./foo/bar', params='', query='', fragment='')
file:///./foo/bar
ParseResult(scheme='file', netloc='', path='/./foo/bar', params='', query='', fragment='')

Oops! Notice how, when we "unparse" the result from the first parse call, our path becomes absolute file:///....

This is why the second mentioned check fails - the path is not local. I believe this to be a bug in six but can be mitigated in Pip by allowing scheme in ['file', ''] and instructing users to use the ./foo/bar URI form.

Given these two contradictory pieces of logic, it's impossible to use local paths in install_requires keys in either distutils or setuptools configurations.

Expected behavior

I should be able to do name @ ./some/path (or, honestly, simply ./some/path) to specify a vendored package local to my codebase.

How to Reproduce

#!/usr/bin/env bash
mkdir /tmp/pip-uri-repro && cd /tmp/pip-uri-repro

mkdir -p foo/bar

cat > requirements.txt <<EOF
./foo
EOF

cat > foo/setup.py <<EOF
#!/usr/bin/env python
from setuptools import setup
setup(
    name="foo",
    version="0.1",
    install_requires=[
        "bar @ file:./bar"
    ]
)
EOF

cat > foo/bar/setup.py <<EOF
#!/usr/bin/env python
from setuptools import setup
setup(
    name="bar",
    version="0.1"
)
EOF

# (OUTPUT 1)
pip install -r requirements.txt

cat > foo/setup.py <<EOF
#!/usr/bin/env python
from setuptools import setup
setup(
    name="foo",
    version="0.1",
    install_requires=[
        # we're forced to use an absolute path
        # to make the "Invalid URL" error go
        # away, which isn't right anyway (the
        # error that is raised as a result
        # is justified)
        "bar @ file://./bar"
    ]
)
EOF

# (OUTPUT 2)
pip install -r requirements.txt

Output

From the first pip install:

Processing ./foo
    ERROR: Complete output from command python setup.py egg_info:
    ERROR: error in foo setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Invalid URL given

From the second pip install:

Processing ./foo
ERROR: Exception:
Traceback (most recent call last):
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/cli/base_command.py", line 178, in main
    status = self.run(options, args)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/commands/install.py", line 352, in run
    resolver.resolve(requirement_set)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/resolve.py", line 131, in resolve
    self._resolve_one(requirement_set, req)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/resolve.py", line 294, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/resolve.py", line 242, in _get_abstract_dist_for
    self.require_hashes
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/operations/prepare.py", line 256, in prepare_linked_requirement
    path = url_to_path(req.link.url)
  File "/private/tmp/repro-pip-egg/env3/lib/python3.7/site-packages/pip/_internal/download.py", line 521, in url_to_path
    % url
ValueError: non-local file URIs are not supported on this platform: 'file://./bar'

EDIT:

Just found out that RFC 3986 specifies that relative path URIs are not permitted with the file: scheme, so technically six should be erroring out on file:./foo/bar.

However, that means, technically, I should be able to do the following in my setup.py:

import os

# Directory containing this setup.py
PKG_DIR = os.path.dirname(os.path.abspath(__file__))
install_requires = [
    f"name @ file://{PKG_DIR}/foo/bar"
]

However, pip seems to be creating a "clean" copy of the package in /tmp, so we get something like file:///tmp/pip-req-build-9u3z545j/foo/bar.

Running that through our test function, we satisfy the second function's conditional:

>>> tryparse('file:///tmp/pip-req-build-9u3z545j/foo/bar')
file:///tmp/pip-req-build-9u3z545j/foo/bar
ParseResult(scheme='file', netloc='', path='/tmp/pip-req-build-9u3z545j/foo/bar', params='', query='', fragment='')
file:///tmp/pip-req-build-9u3z545j/foo/bar
ParseResult(scheme='file', netloc='', path='/tmp/pip-req-build-9u3z545j/foo/bar', params='', query='', fragment='')

Everything is good there. The "unparse" yields the same result, and the netloc requirements are met for the first function's conditional.

However, we're still met with an Invalid URL error, even though the second function's logic is satisfied.

Since pip (or distutils or setuptools or whatever) swallows output, I went ahead and did the following in my setup.py

import os
PKG_DIR = os.path.dirname(os.path.abspath(__file__))
assert False, os.system(f"find {PKG_DIR}")

Which verifies that all of the files are there, as expected - so it can't be a file missing or something. The line above that has "Invalid URL given" is the only place in the codebase that string shows up.

At this point, I'm not sure what the problem is.

Qix- avatar Jun 28 '19 18:06 Qix-

Okay, I see the problem. setuptools, pkg_resources and pip all use slightly different versions of the packaging library.

In pip, it's the version I showed above.

However, in everything else, it's the following (I'm not sure which is the "newer" one, but the following logic is very limiting and not fully compliant as per RFC 3986 as file:/// should be allowed, implying an empty netloc):

        if req.url:
            parsed_url = urlparse.urlparse(req.url)
            if not (parsed_url.scheme and parsed_url.netloc) or (
                    not parsed_url.scheme and not parsed_url.netloc):
                raise InvalidRequirement("Invalid URL given")

🙄

That means since my filepath has file:///foo/bar and not file://localhost/foo/bar then it fails.
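To make that concrete, here is a small sketch that copies the condition quoted above into a standalone function and runs a few URLs through it. The function name `old_check_rejects` is made up for this example; only the boolean expression comes from the snippet above.

```python
from urllib.parse import urlparse


def old_check_rejects(url):
    """Return True when the condition quoted above would raise
    InvalidRequirement("Invalid URL given") for `url`."""
    parsed = urlparse(url)
    return (not (parsed.scheme and parsed.netloc)) or (
        not parsed.scheme and not parsed.netloc
    )


for candidate in ("file:///foo/bar", "file://localhost/foo/bar", "./foo/bar"):
    verdict = "rejected" if old_check_rejects(candidate) else "accepted"
    print(candidate, "->", verdict)
```

Because the first clause demands both a scheme and a non-empty netloc, `file:///foo/bar` is rejected while `file://localhost/foo/bar` is accepted.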

Here is the complete solution:

import os
from setuptools import setup

PKG_DIR = os.path.dirname(os.path.abspath(__file__))

setup(
    install_requires=[
        f'foo @ file://localhost{PKG_DIR}/foo/bar'
    ]
)
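If the package directory can contain spaces or other characters that are not valid in a URL, the path probably needs percent-encoding before it goes into the requirement string. A minimal sketch, assuming a POSIX-style absolute path; `localhost_file_url` is a hypothetical helper name, not something setuptools or pip provides:

```python
from urllib.parse import quote, urlparse


def localhost_file_url(abs_path):
    """Hypothetical helper: build a file://localhost/... URL from an
    absolute POSIX path, percent-encoding unsafe characters."""
    if not abs_path.startswith("/"):
        raise ValueError("expected an absolute POSIX path: %r" % abs_path)
    # quote() leaves "/" untouched by default, so the path separators survive.
    return "file://localhost" + quote(abs_path)


url = localhost_file_url("/tmp/my pkg/foo/bar")
print(url)
print(urlparse(url).netloc)
```

The result keeps a non-empty netloc (`localhost`), which is what the stricter check needs, and could be used as `f"foo @ {localhost_file_url(PKG_DIR + '/foo/bar')}"`.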

This is pretty bad UX mixed in with ambiguous and time-wasting errors.

How can we improve this situation?

Qix- avatar Jun 28 '19 18:06 Qix-

@Qix- glad you found this! I was beating my head against the wall trying all the same formats. This is my alternative option to https://github.com/pypa/pip/issues/6162 and the deprecation of dependency_links.

We're trying to set up a private repo and don't have our own internal server. Our solution is to publish packages to s3 and then to consume them we download them, put them in a local folder, then add them to install_requires.

I'm sure there are many other use cases that would benefit from an intuitive way to install local packages.

ryanaklein avatar Nov 26 '19 02:11 ryanaklein

@ryanaklein I would actually suggest ignoring all of the unresearched negativity towards git submodules and trying them out (assuming you're using Git). If you stop thinking about them as branches and start thinking about them as tags (or releases), they begin to work really well. They're very frequently used in the C/C++ world, and we vendored Python packages using them pretty successfully (aside from the above bug, of course!).

Might cut down on the network/$$ costs of S3 :)

Qix- avatar Nov 26 '19 03:11 Qix-

Expected behavior I should be able to do name @ ./some/path (or, honestly, simply ./some/path) to specify a vendored package local to my codebase.

For the direct URL reference (name @ ./some/path) there are two places where work is happening:

  1. pypa/packaging#120 which tracks rejection of valid PEP 508 direct url references
  2. on the pip side we need to interpret relative paths relative to something, I started the discussion here to get feedback on what makes the most sense. We could track the eventual action here or in a new dedicated issue.

The latter wouldn't be acceptable per PEP 508, so it would be hard to justify supporting much less get it working across all tools.

This is pretty bad UX mixed in with ambiguous and time-wasting errors. How can we improve this situation?

#5204 should help with this general question from a pip standpoint.

chrahunt avatar Dec 08 '19 17:12 chrahunt

Oops! Notice how, when we "unparse" the result from the first parse call, our path becomes absolute file:///...

I think this is due to the CPython bug raised in issue 22852 – "urllib.parse wrongly strips empty #fragment, ?query, //netloc"

That bug seems to cause also issue #3783 – see this comment.

piotr-dobrogost avatar Dec 16 '19 14:12 piotr-dobrogost

What is the status of this issue? A solution for resolving local dependencies that are not on PyPI is urgently needed, for example in the context of monolithic repositories.

Note that npm has implemented this feature in a similar way and dependencies can be specified in package.json using a local path.

hackermd avatar May 15 '20 16:05 hackermd

Please see this comment above to know what needs to happen before this has a chance to be implemented. pip can do nothing before that.

uranusjr avatar May 15 '20 16:05 uranusjr

FYI Hatch now supports relative paths using the new context formatting capability https://hatch.pypa.io/latest/config/dependency/#local

ofek avatar May 23 '22 03:05 ofek

Hello, is there any progress on supporting relative references? I see more and more people dropping pip in favor of other tools that do support relative links and more. Six years have passed... Thanks,

alonbl avatar Oct 26 '25 21:10 alonbl

No. If you're interested in seeing it happen, we would welcome a PR!

ichard26 avatar Oct 26 '25 21:10 ichard26

Actually, looking at the linked DPO discussion, it seems like the consensus was against permitting relative paths for Direct URL requirements in Install-Requires metadata. I'm not sure if there was further follow-up discussion anywhere else, but the resolution seems to be no action.

There is some other commentary on pip's URL parsing/handling in the original description that I'd like to check, but I'll do that at a later date.

ichard26 avatar Oct 26 '25 21:10 ichard26

No. If you're interested in seeing it happen, we would welcome a PR!

I would be very happy to, but I do not see a consensus on what an acceptable solution is. I would go with relative file:../xxx, but I am also open to environment variable expansion in pip when using file://${ENV}/xxx. Another option is to add a --dependency-root= argument to pip and resolve all file:/// URLs relative to this directory.

What will you accept?

There is no such thing as absolute paths in filesystems, everything is relative to something.

alonbl avatar Oct 26 '25 21:10 alonbl

It turns out that pip does actually tolerate relative paths in Direct URL requirements in Install-Requires metadata.

Installing a package with this setup.py totally works as long as the current directory is the right package.

from setuptools import find_packages, setup

setup(
    name="version_pkg",
    version="0.1",
    packages=find_packages(),
    py_modules=["version_pkg"],
    install_requires=["pip @ file:."],
)

However, the pip project strives to implement standards-backed behaviour and features. Relative paths in Direct URL requirements would be a pip-specific extension to the formal dependency specifier standard(s). Given that the desired behaviour can be achieved using --find-links or vendoring, I will say that you should not rely on this feature given it's undocumented and a quirk of lax requirement checks. Anyway, I'd imagine pip's current behaviour using the CWD as the root is probably suboptimal.
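To illustrate why the choice of root matters, here is a rough sketch of resolving a relative `file:` reference against an explicit base directory. It is purely hypothetical: `resolve_file_reference` and its `base_dir` parameter are invented for this example and do not describe how pip resolves anything. `posixpath` is used so the result does not depend on the host OS.

```python
import posixpath
from urllib.parse import urlsplit


def resolve_file_reference(url, base_dir):
    """Hypothetical: turn a file: URL into a path, resolving a relative
    path component against `base_dir`."""
    parts = urlsplit(url)
    if parts.scheme != "file" or parts.netloc not in ("", "localhost"):
        raise ValueError("not a local file: URL: %r" % url)
    # posixpath.join() discards base_dir when parts.path is already absolute.
    return posixpath.normpath(posixpath.join(base_dir, parts.path))


# The same reference lands in different places depending on the root.
print(resolve_file_reference("file:./bar", "/src/foo"))
print(resolve_file_reference("file:./bar", "/home/user"))
```

With the directory of the depending package as the root, `file:./bar` would always mean the same thing; with the current working directory as the root, it depends on where the command is run from.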

If other tools' extensions to the Direct URL reference syntax prove popular, a PEP can be proposed to standardize relative paths, which, if accepted, pip can adopt afterwards.

ichard26 avatar Oct 26 '25 22:10 ichard26

Thanks, I am working with pyproject.toml and would like a solution that may be used in this context as well, and not as a hack.

What about the hatch[1] solution to expand specific variables?

And if the syntax should stay intact, then I would like to suggest the additional --dependency-root= argument as a temporary solution until such a PEP is defined, as in the current situation local dependencies are unusable in any real-world scenario.

[1] https://hatch.pypa.io/latest/config/dependency/#local

alonbl avatar Oct 26 '25 22:10 alonbl

It is very annoying, as tox does support relative links; I cannot reach a situation in which both tox and pip work with the same configuration.

alonbl avatar Oct 26 '25 22:10 alonbl

I know this is a disappointing answer, but as I said before, we aren't going to be adding a custom extension to standardized features. --dependency-root may technically not be an extension, but in practice it is.

ichard26 avatar Oct 26 '25 22:10 ichard26