pip-tools

future: add commit to the compiled output again (pinning)

Open blueyed opened this issue 9 years ago • 14 comments

When using a requirement like -e git+git://github.com/jezdez/django-configurations.git#egg=django-configurations (currently required as per https://github.com/nvie/pip-tools/issues/155), the compiled output is identical to the input line.

But it should be the current commit / hash at that branch, just like pip freeze shows it:

-e git+git@github.com:jezdez/django-configurations.git@5ece1070448d3e026b81422691c60207f5699c0c#egg=django_configurations-master

This is a regression after merging the massive-rewrite branch.

blueyed avatar May 12 '15 23:05 blueyed

It looks like this could be done through from pip.vcs import get_src_requirement:

def get_src_requirement(self, dist, location, find_tags=False):
    """
    Return a string representing the requirement needed to
    redownload the files currently present in location, something
    like:
      {repository_url}@{revision}#egg={project_name}-{version_identifier}
    If find_tags is True, try to find a tag matching the revision
    """

blueyed avatar Jun 16 '15 12:06 blueyed

I have this working now!

PR will follow later.

blueyed avatar Jun 16 '15 23:06 blueyed

Take a look at: https://developer.github.com/changes/2016-02-24-commit-reference-sha-api/

cancan101 avatar Mar 08 '16 16:03 cancan101

That's not really related and for Github only, isn't it?!

btw: the PR I was referring to has since been closed/rejected: https://github.com/nvie/pip-tools/pull/169

blueyed avatar Mar 08 '16 17:03 blueyed

+1 for VCS pinning

georgexsh avatar Apr 20 '16 05:04 georgexsh

@blueyed You mentioned you have this working, can you submit the PR in order to move this issue along?

tysonclugg avatar May 02 '18 11:05 tysonclugg

@tysonclugg Without looking at it again that would have been https://github.com/jazzband/pip-tools/pull/169.

blueyed avatar May 02 '18 12:05 blueyed

@vphilippon Would you consider a version of #169 updated again for the current master? Or at least the part that resolves the commit for editable VCS URLs?

taion avatar Jun 21 '18 23:06 taion

Any guidance on how this should be implemented? It seems the maintainer may have changed since the last PR was rejected. Was it rejected because it was thought the work needed to happen in pip?

jjlee avatar May 01 '20 20:05 jjlee

I am not confident this is a good idea, but here it is:

Inside _build_direct_reference_best_efforts, we could check if ireq.link.url_without_fragment is a vcs URL that does not already end in @<tag-and-or-hash>, and if so, do the equivalent of

$ git ls-remote <url_without_fragment_and_vcs_scheme_stripped> HEAD

to determine a hash, then insert it.

It would need to support all pip-supported VCS aside from git, obviously, and I don't know how/if to get ls-remote-like functionality out of pip's internal vcs utilities, or if that's even what we want to use.

EDIT: Oh and that doesn't address determining hashes for local repos either.
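
A minimal Python sketch of the idea above, limited to remote git URLs. The helper names (has_revision, first_hash, pin_to_head) are hypothetical and not part of pip or pip-tools; the only external call is the git ls-remote command already mentioned, and local repos and other VCS backends are not covered.

```python
import re
import subprocess

# Hypothetical helpers for illustration; none of these exist in pip-tools.
_REV_SUFFIX = re.compile(r"@[^/@]+$")


def has_revision(url: str) -> bool:
    """Return True if the URL already ends in @<tag-or-hash>."""
    return _REV_SUFFIX.search(url) is not None


def first_hash(ls_remote_output: str) -> str:
    """Return the hash in the first field of `git ls-remote` output."""
    fields = ls_remote_output.split()
    if not fields:
        raise ValueError("git ls-remote returned no refs")
    return fields[0]


def pin_to_head(url: str) -> str:
    """Append the remote HEAD commit to an unpinned git+ URL."""
    if not url.startswith("git+") or has_revision(url):
        return url  # not a git URL, or already pinned
    remote = url[len("git+"):]  # strip the VCS scheme prefix
    output = subprocess.check_output(
        ["git", "ls-remote", remote, "HEAD"], text=True
    )
    return f"{url}@{first_hash(output)}"
```

Only pin_to_head touches the network; the two helpers can be checked against canned ls-remote output.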

AndydeCleyre avatar Sep 22 '21 19:09 AndydeCleyre

Maybe better to handle this in piptools/_compat/pip_compat.py:parse_requirements?

AndydeCleyre avatar Sep 22 '21 19:09 AndydeCleyre

Edited to combine two comments and to share some generic thoughts up-front.

There are at least two ways to use git-based dependencies:

  1. don't pin them, in which case they "implicitly pin" to main (which is a moving target)
  2. pin them, in which case you may need to pin all of them throughout your dep tree (i.e., in the install_requires of setup.py)

In world (1), the current behavior of pip-compile is arguably reasonable (a subjective opinion from me, and not necessarily fully thought out). In this world all git-based dependencies are always "implicitly pinned" to main, and old combinations of package versions are unsupported. If package A depends on package B@main, then once a new commit is made to the main branch of B you can never reproduce that environment again. If pip-compile did record B@<commit_hash> in requirements.txt, that pin would no longer satisfy the requirement B@main in package A's install_requires after the new commit. This world is reasonable when the aim is to keep the latest versions of all git-based packages compatible with each other without supporting reproducible environments (since main is likely to have changed). The underlying question is whether it makes sense to use pip-compile (or requirements.txt at all) if reproducing envs is not of interest. Note that pip freeze would record B@<commit_hash>, and after a new commit to B's main branch pip install -r requirements.txt would no longer succeed either.

In world (2), if we pin all git-based deps throughout the dep tree (i.e., in the install_requires of setup.py) then it would make sense to also pin them in requirements.txt. In this world, pip freeze creates reproducible/future-proof requirements.txt files since B@<commit_hash> will match B@<pinned_version> in A's install_requires even if a new commit is made to the main branch of B. In the context of the current issue, if we write an abstract git-based dep on A in requirements.in, pip-compile could fill in the hash of the current latest version for us (this is part of the typical expected behavior of pip-compile). A workaround is to manually pin the versions of top-level/direct git-based deps in requirements.in (presumably the indirect git-based deps will get pins in requirements.txt because the top-level install_requires should include a pin for all git-based deps).

Another workaround within world (2) that I think achieves a similar effect is to do

pip-compile requirements.in -o requirements_partially_frozen.txt
pip install -r requirements_partially_frozen.txt
pip freeze > requirements_fully_frozen.txt

or if we don't care to track why a dep was required

pip install -r requirements.in
pip freeze > requirements.txt

but if there are relatively few top-level/direct git-based deps it may be less work to just pin them in requirements.in.

thomasgilgenast avatar Aug 11 '23 15:08 thomasgilgenast

This is still my #1 feature request, and probably not only mine. There's really no convenient way to keep a repo dependency updated after initially pinning it. So most people just don't pin, and stuff can break at any time.


pip-compile has come a long way and I feel like a new pull request such as #169 would fall into place perfectly. (Though that's just my uneducated impression; diving into the code seemed daunting to me at the moment.)

In the past, even recognizing (non-editable) git URLs and just passing them through was an achievement. Now the repository is fully resolved, with its dependencies added.

In the past, pip complained about a git URL with a commit hash (only wanted a branch). Now that just works seamlessly.


To clarify again, the request is:

  • Input: pip-tools @ git+https://github.com/jazzband/pip-tools.git@main
  • Current output: pip-tools @ git+https://github.com/jazzband/pip-tools.git@main
  • Desired output: pip-tools @ git+https://github.com/jazzband/pip-tools.git@f20d6ae0af45cfeb54e69d7ead9a9d09449b8d5c
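
The requested transformation can be sketched as a pure function over one requirement line. This is an illustration only: pin_direct_reference and the resolve callback are hypothetical names, and how the commit actually gets looked up (for example via git ls-remote) is left to the caller.

```python
from typing import Callable


def pin_direct_reference(line: str, resolve: Callable[[str, str], str]) -> str:
    """Rewrite `name @ git+URL@ref` so that ref becomes a commit hash.

    `resolve(repo_url, ref)` is a caller-supplied lookup (an assumption of
    this sketch); lines that do not look like a git direct reference with a
    trailing ref are returned unchanged.
    """
    name, sep, url = line.partition(" @ ")
    if not sep or not url.startswith("git+"):
        return line
    repo, at, ref = url.rpartition("@")
    if not at or "/" in ref:
        return line  # the last "@" is part of the host, not a ref
    commit = resolve(repo[len("git+"):], ref)
    return f"{name} @ {repo}@{commit}"


# Stand-in resolver that returns the hash quoted in the example above.
def fake_resolve(repo_url: str, ref: str) -> str:
    return "f20d6ae0af45cfeb54e69d7ead9a9d09449b8d5c"


print(pin_direct_reference(
    "pip-tools @ git+https://github.com/jazzband/pip-tools.git@main",
    fake_resolve,
))
```

Injecting the resolver keeps the string handling testable without network access.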

oprypin avatar Nov 20 '23 14:11 oprypin

I hope someone can help with a better way, but I just tried this hack locally with good results:

diff --git a/piptools/utils.py b/piptools/utils.py
index 62cb26a..f7661dc 100644
--- a/piptools/utils.py
+++ b/piptools/utils.py
@@ -8,6 +8,7 @@ import json
 import os
 import re
 import shlex
+import subprocess  # nosec
 import sys
 from pathlib import Path
 from typing import Any, Callable, Iterable, Iterator, TypeVar, cast
@@ -166,7 +167,19 @@ def _build_direct_reference_best_efforts(ireq: InstallRequirement) -> str:
     # We need to remove the egg if it exists and keep the rest of the fragments.
     lowered_ireq_name = canonicalize_name(ireq.name)
     extras = f"[{','.join(sorted(ireq.extras))}]" if ireq.extras else ""
-    direct_reference = f"{lowered_ireq_name}{extras} @ {ireq.link.url_without_fragment}"
+
+    url = ireq.link.url_without_fragment
+    m = re.match(r"git\+(.*)@([^/]+)$", url)
+    if m:
+        url, branch = m.groups()
+        branch = (
+            subprocess.check_output(["git", "ls-remote", url, branch])  # nosec
+            .split()[0]
+            .decode("utf-8")
+        )
+        url = f"git+{url}@{branch}"
+
+    direct_reference = f"{lowered_ireq_name}{extras} @ {url}"
     fragments = []
 
     # Check if there is any fragment to add to the URI.
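
To see which URLs the pattern in this patch picks up, it can be tried in isolation. The regex below is copied from the diff; split_ref is a hypothetical wrapper added for the demonstration.

```python
import re

# Pattern copied from the patch above.
GIT_REF = re.compile(r"git\+(.*)@([^/]+)$")


def split_ref(url):
    """Return (repo_url, ref) if the pattern matches, else None."""
    match = GIT_REF.match(url)
    return match.groups() if match else None


samples = [
    "git+https://github.com/jazzband/pip-tools.git@main",
    "git+https://github.com/jazzband/pip-tools.git",
    "git+ssh://git@github.com/jazzband/pip-tools.git",
    "git+https://github.com/jazzband/pip-tools.git@feature/x",
]
for url in samples:
    print(url, "->", split_ref(url))
```

Only the first sample matches. A URL without a ref is left alone, the `[^/]+` part keeps the `git@github.com` user prefix of an SSH URL from being read as a ref, and for the same reason a branch name containing a slash is not resolved.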

EDIT: Much more hacky, with some more cases "handled":

diff --git a/piptools/utils.py b/piptools/utils.py
index de4399b..0ba0f99 100644
--- a/piptools/utils.py
+++ b/piptools/utils.py
@@ -8,6 +8,7 @@ import json
 import os
 import re
 import shlex
+import subprocess  # nosec
 import sys
 from pathlib import Path
 from typing import Any, Callable, Iterable, Iterator, TypeVar, cast
@@ -19,6 +20,8 @@ if sys.version_info >= (3, 11):
 else:
     import tomli as tomllib
 
+from contextlib import suppress
+
 import click
 from click.utils import LazyFile
 from pip._internal.req import InstallRequirement
@@ -166,7 +169,37 @@ def _build_direct_reference_best_efforts(ireq: InstallRequirement) -> str:
     # We need to remove the egg if it exists and keep the rest of the fragments.
     lowered_ireq_name = canonicalize_name(ireq.name)
     extras = f"[{','.join(sorted(ireq.extras))}]" if ireq.extras else ""
-    direct_reference = f"{lowered_ireq_name}{extras} @ {ireq.link.url_without_fragment}"
+
+    url = ireq.link.url_without_fragment
+    git_url_match = re.match(r"git\+(.+)$", url)
+    if git_url_match:
+        git_cmd = ["git", "ls-remote"]
+        branch_match = re.match(r"git\+(.+)@([^/]+)$", url)
+        if branch_match:
+            url, branch = branch_match.groups()
+            git_cmd.extend(branch_match.groups())
+        else:
+            url, branch = git_url_match.group(1), ""
+            git_cmd.append(url)
+        try:
+            branch = (
+                subprocess.check_output(git_cmd).split()[0].decode("utf-8")  # nosec
+            )
+        except IndexError as e:
+            if not branch:
+                raise e
+            lines = (
+                subprocess.check_output(["git", "ls-remote", url])  # nosec
+                .decode("utf-8")
+                .splitlines()
+            )
+            with suppress(StopIteration):
+                branch = next(filter(lambda ln: ln.startswith(branch), lines)).split()[
+                    0
+                ]
+        url = f"git+{url}@{branch}"
+    direct_reference = f"{lowered_ireq_name}{extras} @ {url}"
+
     fragments = []
 
     # Check if there is any fragment to add to the URI.
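
The fallback branch in this second patch appears to cover abbreviated commit hashes: when ls-remote finds no ref with the given name, the full ref listing is scanned for a line that starts with it. A standalone sketch of that step, using a hypothetical function name and made-up sample output:

```python
from typing import Optional


def expand_short_hash(prefix: str, ls_remote_output: str) -> Optional[str]:
    """Return the first full hash in `git ls-remote` output starting with prefix."""
    for line in ls_remote_output.splitlines():
        if line.startswith(prefix):
            return line.split()[0]  # first field of each line is the hash
    return None


# Made-up listing in the "<hash>\t<ref>" shape that git ls-remote prints.
SAMPLE = (
    "f20d6ae0af45cfeb54e69d7ead9a9d09449b8d5c\tHEAD\n"
    "f20d6ae0af45cfeb54e69d7ead9a9d09449b8d5c\trefs/heads/main\n"
    "0123456789abcdef0123456789abcdef01234567\trefs/tags/v1\n"
)
print(expand_short_hash("f20d6ae", SAMPLE))
```

Because ls-remote lists only the commits that refs currently point at, this lookup cannot expand a short hash for a commit that is no longer the tip of any branch or tag.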

AndydeCleyre avatar Nov 21 '23 06:11 AndydeCleyre