poetry icon indicating copy to clipboard operation
poetry copied to clipboard

Feat: support multiple hash algorithms

Open Ckarles opened this issue 2 years ago • 7 comments

Pull Request Check List

Resolves: #6301 #4085

  • [x] Added tests for changed code.
  • [ ] ~~Updated documentation for changed code.~~

Ckarles avatar Sep 12 '22 15:09 Ckarles

Pretty sure that this is at best a part of the solution: it's not enough to verify an md5 hash against a value in a lockfile, you also need to persuade poetry to put a non-sha256 hash into the lockfile in the first place.

Also you are likely calculating the same hash again and again, which is undesirable

dimbleby avatar Sep 12 '22 16:09 dimbleby

I have a PR cooking locally that attempts to solve this holistically -- however, I confess that reworking hash handling in a way that we can backport is rather difficult, and it may be easier to just forward-port the change from the 1.1 branch and consider the holistic rework something for 1.3 only.

neersighted avatar Sep 12 '22 16:09 neersighted

Thx for the quick review, I've been late to the party.

Please help me figure out if I'm reading this correctly:

When the lock file contains the package with its file+hash:

  • find a link on the repo <- fails if the lockfile hash_name:hash is not equal to the one on the repo Chooser._get_links()
  • validate the package by comparing lockfile hash with the computed hash from the downloaded archive Executor._validate_archive_hash()

When the lock file does not contains the package, Poetry will add a hash of the package according to the specific repository implementation, e.g.

  • for pypi repo, the hash is implied sha256 and extracted from the json pypi response file_info["digests"]["sha256"]
  • for legacy repo, hashname:hash is extracted from url.hash and written as-is in the lockfile.

If I understand correctly, the issues triggered by v1.2 regarding hashes are:

  1. If the hash from the lockfile is not of the same algo than the one on the repo, Poetry can't find a valid archive to download.
  2. sha256 use is hardcoded for archive validation with lockfile.
  3. There could be artifact repository implementations that Poetry does not support ?

2 has been addressed in v1.1 branch. 1 has no perfect solution afaik, on the top of my head. Is it safe to assume that, with current 1.2 implementation, as long as the hash provided by the repo doesn't change, the lockfile continues to be compatible ? If it is the case, as long as there is no breaking change on the artifact repo, there is no need to re-generate a lockfile, and if there is such breaking change on the repo, it will be obligatory to re-generate a lock file. If the previous statement is correct, it seems to be an acceptable behavior (as long as it is advertised properly during a failing install).

Ckarles avatar Sep 13 '22 11:09 Ckarles

The problem (with Nexus at least) is that the repo hashes do change for all versions of a package whenever any new version is pushed for that package.

qbedard avatar Sep 14 '22 14:09 qbedard

Why aren't we merging it?

Fabhi avatar Sep 20 '22 11:09 Fabhi

As others have pointed, I don't think this is good enough.

That particular piece of code has already been written in a better way on 1.1 branch, also other matters need to be addressed, particularly regarding the chooser and the pipy adapter.

I'll complete this MR before next week with a more solid implementation. It will not cover any scenario but will at least provide guidance in case the install from non-sha256 repos doesn't work (instructing to perform another lock). From what I've seen a better management of hashes requires some refactoring and changes to poetry-core (which is not the focus of that MR).

Ckarles avatar Sep 20 '22 11:09 Ckarles

As others have pointed, I don't think this is good enough.

That particular piece of code has already been written on a better way in 1.1 branch, also other matters need to be addressed, particularly regarding the chooser and the pipy adapter.

I'll complete this MR before next week with a more solid implementation. It will not cover any scenario but will at least provide guidance in case the install from non-sha256 repos doesn't work (instructing to perform another lock). From what I've seen a better management of hashes requires some refactoring and changes to poetry-core (which is not the focus on that MR).

Wonderful! Let me know how I can help. I don't have much experience with contributing to open source, but I am good enough with python.

Fabhi avatar Sep 20 '22 12:09 Fabhi

  1. Based on @Ckarles comment, will this PR fix the issue with md5-only hashed package or is it only partially adressing it?
  2. If someone can give me some pointers, I'd be happy to look into creating a PR that fixes this in poetry-core as/if necessary. We are "stuck" on an older version of Nexus that is beyond our control, and we cannot use poetry >=1.2 anywhere due to this bug.

cpnielsen avatar Oct 28 '22 08:10 cpnielsen

Superseded by #8118

Secrus avatar Jun 29 '23 21:06 Secrus

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions[bot] avatar Mar 03 '24 20:03 github-actions[bot]