Intermittent Hash Validation Failures

Open rohanp-eiq opened this issue 2 years ago • 16 comments

  • Poetry version: 1.4.2
  • Python version: 3.10.6
  • OS version and name: Ubuntu 22.04.2 LTS (running in a docker image based on this)
  • pyproject.toml: https://gist.github.com/rohanp-eiq/3171fb1561994642b2580313b1939a29
  • [x] I am on the latest stable Poetry version, installed using a recommended method.
  • [x] I have searched the issues of this repo and believe that this is not a duplicate.
  • [x] I have consulted the FAQ and blog for any relevant entries or release notes.
  • [x] If an exception occurs when executing a command, I executed it again in debug mode (-vvv option) and have included the output below.

Issue

Hi, we are running into an issue where libraries intermittently fail to install. The command poetry install -vvv --with dev,test --sync fails with the following error (torch is one example; this has failed on other libraries too):


  7  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:280 in _execute_operation
      278│ 
      279│             try:
    → 280│                 result = self._do_execute_operation(operation)
      281│             except EnvCommandError as e:
      282│                 if e.e.returncode == -2:

  6  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:382 in _do_execute_operation
      380│             return 0
      381│ 
    → 382│         result: int = getattr(self, f"_execute_{method}")(operation)
      383│ 
      384│         if result != 0:

  5  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:502 in _execute_install
      500│ 
      501│     def _execute_install(self, operation: Install | Update) -> int:
    → 502│         status_code = self._install(operation)
      503│ 
      504│         self._save_url_reference(operation)

  4  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:540 in _install
      538│             archive = self._download_link(operation, Link(package.source_url))
      539│         else:
    → 540│             archive = self._download(operation)
      541│ 
      542│         operation_message = self.get_operation_message(operation)

  3  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:715 in _download
      713│             self._yanked_warnings.append(message)
      714│ 
    → 715│         return self._download_link(operation, link)
      716│ 
      717│     def _download_link(self, operation: Install | Update, link: Link) -> Path:

  2  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:754 in _download_link
      752│ 
      753│         # Use the original archive to provide the correct hash.
    → 754│         self._populate_hashes_dict(original_archive, package)
      755│ 
      756│         return archive

  1  ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:760 in _populate_hashes_dict
      758│     def _populate_hashes_dict(self, archive: Path, package: Package) -> None:
      759│         if package.files and archive.name in {f["file"] for f in package.files}:
    → 760│             archive_hash = self._validate_archive_hash(archive, package)
      761│             self._hashes[package.name] = archive_hash
      762│ 

  RuntimeError

  Hash for torch (2.0.0) from archive torch-2.0.0-cp310-cp310-manylinux1_x86_64.whl not found in known hashes (was: sha256:1056dbd19648e16b410f610ae6e556a783ed566b18ddcb49c8af688c70748e48)

  at ~/.local/pipx/venvs/poetry/lib/python3.10/site-packages/poetry/installation/executor.py:769 in _validate_archive_hash
      765│         archive_hash: str = "sha256:" + get_file_hash(archive)
      766│         known_hashes = {f["hash"] for f in package.files if f["file"] == archive.name}
      767│ 
      768│         if archive_hash not in known_hashes:
    → 769│             raise RuntimeError(
      770│                 f"Hash for {package} from archive {archive.name} not found in"
      771│                 f" known hashes (was: {archive_hash})"
      772│             )
      773│ 

We have seen this mostly occur in GitHub Actions. The relevant steps are:

  - name: Install Poetry
    run: pipx install poetry

  - uses: actions/[email protected]
    with:
      python-version: "${{ env.PYTHON_VERSION }}" # this is 3.10.6

  - name: install deps
    run: |
      poetry install --with dev,test --sync

We see intermittent issues with different libraries failing on hash checks, and retrying with no changes ends up fixing the problem.

We've tried adding steps like: poetry cache clear . --all and

rm -rf ~/.cache/pypoetry/cache
rm -rf ~/.cache/pypoetry/artifacts
poetry lock --no-update

but haven't had any luck. Additionally, we've verified that the runner isn't caching data in a way that would be causing issues.

Thank you for any help in advance!

rohanp-eiq avatar May 05 '23 15:05 rohanp-eiq

Well indeed that's not the correct hash for that file - see https://pypi.org/project/torch/#copy-hash-modal-6e25311f-3fc9-4584-a403-74f06dd31ce3. poetry is protecting you from installing the wrong thing.

perhaps you are short of disk space and are getting incomplete downloads.

dimbleby avatar May 05 '23 16:05 dimbleby

Thanks for the help!

We checked the Kubernetes pods that we're running this on: we have 100GB+ of free disk space mounted to the paths that poetry would install to. We were able to dig a bit deeper and saw that one of our dependencies, tensorflow, downloaded a wheel that was ~100MB (the package on PyPI is ~585MB). That explains why the hash is wrong, but we don't understand why the download ends up incomplete or what could be contributing to it.
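For anyone who wants to confirm that a cached wheel is truncated rather than subtly altered, here is a minimal sketch. It relies only on the fact that a wheel is a zip archive, so a cut-off download usually cannot be opened at all. `wheel_problem` is a hypothetical helper name, not part of Poetry.

```python
import zipfile
from pathlib import Path
from typing import Optional


def wheel_problem(path: Path) -> Optional[str]:
    """Describe what is wrong with the wheel at `path`, or return None if it reads cleanly."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and returns the first one with a bad CRC.
            bad_member = zf.testzip()
    except zipfile.BadZipFile as exc:
        # A truncated download typically lands here: the zip index lives at the end of the file.
        return f"not a readable zip archive: {exc}"
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return None
```

A clean result does not prove the hash matches; it only rules out the "partial file" explanation for that archive.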

rohanp-eiq avatar May 09 '23 19:05 rohanp-eiq

Also experiencing this in Docker on macOS (poetry 1.5.1, python 3.11)

RuntimeError                                    
                                                    
  Hash for scipy (1.10.1) from archive scipy-1.10.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl not found in known hashes (was: sha256:b323423e76ae0f6fb4aace295d5a95d9c49ac735bf0b324b327b7db472619490)                                              
                                                    
  at /usr/local/lib/python3.11/site-packages/poetry/installation/executor.py:818 in _validate_archive_hash                                                                                                         
      814│         archive_hash: str = "sha256:" + get_file_hash(archive)                                
      815│         known_hashes = {f["hash"] for f in package.files if f["file"] == archive.name}        
      816│                                        
      817│         if archive_hash not in known_hashes:                                                  
    → 818│             raise RuntimeError(                                                               
      819│                 f"Hash for {package} from archive {archive.name} not found in"                
      820│                 f" known hashes (was: {archive_hash})"                                        
      821│             )                            
      822│                                                                              
RuntimeError                          
                                                    
  Hash for torch (2.0.1) from archive torch-2.0.1-cp311-cp311-manylinux1_x86_64.whl not found in known hashes (was: sha256:c3ba63617c35ff58a95e6a4f7e9fb5fea3153cfacc8573bf392d01c08d24f129)                       
                                                    
  at /usr/local/lib/python3.11/site-packages/poetry/installation/executor.py:818 in _validate_archive_hash                                                                                                         
      814│         archive_hash: str = "sha256:" + get_file_hash(archive)                                
      815│         known_hashes = {f["hash"] for f in package.files if f["file"] == archive.name}        
      816│                      
      817│         if archive_hash not in known_hashes:                                                  
    → 818│             raise RuntimeError(
      819│                 f"Hash for {package} from archive {archive.name} not found in"                
      820│                 f" known hashes (was: {archive_hash})"                                        
      821│             )                  
      822│                       

Other failed packages include pydantic and numpy. pydantic is a very small distribution, so an incomplete download seems less likely to be to blame there.

Not able to reproduce in a Linux VM or in Docker on a Linux VM.

micahjsmith avatar Jun 07 '23 15:06 micahjsmith

Again, poetry is correct to refuse to install an archive with the wrong hash.

You'll maybe want to inspect the faulty archives - probably available in the poetry cache - and see if you can figure out what's wrong with them and why.

But unless you can find a way in which poetry is doing Something Wrong during download - unlikely since it works for almost everyone and after all requests is pretty widely used - this should likely just be closed. It seems unlikely that there's anything this repository can do about whatever it is you're seeing.

dimbleby avatar Jun 07 '23 16:06 dimbleby

@dimbleby I definitely agree that poetry is doing the Right Thing (TM) regarding the faulty archives.

I guess the theory is it is a storage or network issue? I would expect the download to fail outright rather than the hashes to be missing later in the install process? Either way would you please be able to link to docs or suggest a way to identify the actual archive file? Contents of ~/.cache/pypoetry are typically just hashes and don't seem to contain the faulty archives.

micahjsmith avatar Jun 07 '23 17:06 micahjsmith

.whl files can also be found somewhere in that directory
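A rough sketch for locating them, assuming the default Linux cache root ~/.cache/pypoetry mentioned above. The layout underneath is not spelled out in this thread, so the sketch simply walks the whole tree. `sha256_of` and `hash_archives` are hypothetical helper names; the digest is formatted as `sha256:<hex>` to match the string in the error message.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the file's SHA-256 in the "sha256:<hex>" form used by the error message."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so large wheels (torch, tensorflow) are not loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def hash_archives(cache_root: Path) -> dict[Path, str]:
    """Map every wheel or sdist found anywhere under cache_root to its SHA-256."""
    if not cache_root.is_dir():
        return {}
    archives = [
        p
        for p in cache_root.rglob("*")
        if p.is_file() and p.name.endswith((".whl", ".tar.gz", ".zip"))
    ]
    return {p: sha256_of(p) for p in sorted(archives)}


# Example (assumed default location on Linux; adjust for your setup):
#   for path, digest in hash_archives(Path.home() / ".cache" / "pypoetry").items():
#       print(digest, path.stat().st_size, path)
```

The printed digests can then be compared with the hashes listed on PyPI or in poetry.lock; printing the file size as well makes a partial download easy to spot.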

dimbleby avatar Jun 07 '23 18:06 dimbleby

I would recommend deleting the cache and artifacts folders, deleting your .venv, and running poetry lock. That solved it for me.

rbebb avatar Jun 08 '23 14:06 rbebb

Deleting the poetry artifacts and cache in ~/.cache/pypoetry worked.

rm -rf ~/.cache/pypoetry/cache/
rm -rf ~/.cache/pypoetry/artifacts

so the remote archive (torch for cpu) seems fine in my case.

Flova avatar Jun 12 '23 14:06 Flova

I'm seeing such intermittent hash failures on CI, which is a completely fresh environment, no cache involved. Happens about 1 in 10 runs it seems, and always on the same package mysql-connector-python for me:

  Hash for mysql-connector-python (8.0.33) from archive mysql_connector_python-8.0.33-cp310-cp310-manylinux1_x86_64.whl not found in known hashes (was: sha256:29d15124ce60ee6801fa3ac92927fca06e07636440dbd5b34cb78c3febd682f3)

Poetry 1.5.1 (also tested 1.3.1, same issue) Python 3.10.6 Ubuntu 22.04

silverwind avatar Jul 04 '23 08:07 silverwind

Experiencing this with Docker on macOS (Intel) with the torch and numpy packages, with both Poetry 1.5.1 and 1.4.0 (Poetry installed via pipx) and Python 3.11.4:

 RuntimeError

  Hash for torch (2.0.1) from archive torch-2.0.1-cp311-cp311-manylinux1_x86_64.whl not found in known hashes (was: sha256:895e5689bf7f80726b0a84d33c8222a17f55c608802c091498a0206c163cf96a)

  at /usr/local/py-utils/venvs/poetry/lib/python3.11/site-packages/poetry/installation/executor.py:754 in _validate_archive_hash
      750│         archive_hash: str = "sha256:" + get_file_hash(archive)
      751│         known_hashes = {f["hash"] for f in package.files}
      752│ 
      753│         if archive_hash not in known_hashes:
    → 754│             raise RuntimeError(
      755│                 f"Hash for {package} from archive {archive.name} not found in"
      756│                 f" known hashes (was: {archive_hash})"
      757│             )
      758│ 

byt3bl33d3r avatar Jul 18 '23 20:07 byt3bl33d3r

#8235 establishes that incomplete-downloads-with-no-apparent-error (because flakey connections) is fixed in the next release, via urllib3 2.0

probably safe to assume that's the primary cause here, close this out, and invite reporting of any further issues after poetry 1.6.0

dimbleby avatar Jul 24 '23 21:07 dimbleby

We still see this heavily with 1.6.1. My engineers constantly run into this with Docker builds, and just rerunning the build will magically fix it. At this point I'm going to write a custom wrapper that tries to run poetry 3 times before dying, but ideal would be some flag that retries a download if the hash doesn't match.

I agree that this is probably not something poetry is responsible for, but it would be a huge help for those of us suffering this if poetry could handle a few retries on hash failures.
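In the meantime, the wrapper I have in mind is roughly the sketch below. It is a workaround outside of Poetry, not a Poetry feature; `run_with_retries`, the attempt count, and the delay are all my own placeholder choices.

```python
import subprocess
import sys
import time
from typing import Sequence


def run_with_retries(cmd: Sequence[str], attempts: int = 3, delay: float = 5.0) -> int:
    """Run cmd until it exits 0 or the attempts are used up; return the last exit code."""
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            break
        print(f"attempt {attempt}/{attempts} exited with {returncode}", file=sys.stderr)
        if attempt < attempts:
            time.sleep(delay)  # brief pause before trying again
    return returncode


# Intended use in a build script:
#   sys.exit(run_with_retries(["poetry", "install", "--sync"]))
```

This retries on any non-zero exit, not only hash mismatches, and it would not help if a bad archive were being reused from a cache between attempts.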

apenney avatar Sep 22 '23 14:09 apenney

Still seeing this error

(mygpt-py3.11) PS C:\Users\64478\code\jiangying000\gb-be1> python --version   
Python 3.11.9


(mygpt-py3.11) PS C:\Users\64478\code\jiangying000\gb-be1> poetry --version
Poetry (version 1.8.3)


(mygpt-py3.11) PS C:\Users\64478\code\jiangying000\gb-be1> poetry install     
Installing dependencies from lock file

Package operations: 144 installs, 0 updates, 0 removals

  - Installing scipy (1.12.0): Failed

  RuntimeError

  Hash for scipy (1.12.0) from archive scipy-1.12.0-cp311-cp311-win_amd64.whl not found in known hashes (was: sha256:efeee7d6414d1b4ae5f3025dd5c193b5a8aa03c3c1a0a47132ae07d45859325f)

  at ~\pipx\venvs\poetry\Lib\site-packages\poetry\installation\executor.py:812 in _validate_archive_hash
      808│ 
      809│         archive_hash = f"{hash_type}:{get_file_hash(archive, hash_type)}"
      810│
      811│         if archive_hash not in known_hashes:
    → 812│             raise RuntimeError(
      813│                 f"Hash for {package} from archive {archive.name} not found in"
      814│                 f" known hashes (was: {archive_hash})"
      815│             )
      816│

Cannot install scipy.

(mygpt-py3.11) PS C:\Users\64478\code\jiangying000\gb-be1> 

I can work around this by adding

[[tool.poetry.source]]
name = "mirrors"
url = "https://pypi.tuna.tsinghua.edu.cn/simple/"
priority = "primary"

to pyproject.toml and running poetry install again

jiangying000 avatar Jul 10 '24 14:07 jiangying000

the hash that poetry reports is indeed the wrong value, so it is correct not to install the wheel

probably at some point in the past you have downloaded a partial or corrupt version of the file and that is now in your cache and you should fix by clearing your cache

most likely that download happened before #8235 and there is no current bug

dimbleby avatar Jul 10 '24 16:07 dimbleby

I'm not sure, but will keep an eye on it

jiangying000 avatar Jul 10 '24 16:07 jiangying000

Deleting the poetry artifacts and cache in ~/.cache/pypoetry worked.

rm -rf ~/.cache/pypoetry/cache/
rm -rf ~/.cache/pypoetry/artifacts

so the remote archive (torch for cpu) seems fine in my case.

For the record, I had to do this today because a couple of packages were running into this hashing problem on the latest version of Poetry (2.1.1). Clearing the caches was not enough; I also had to delete the artifacts folder. For anyone on Windows, the corresponding paths are:

~/AppData/Local/pypoetry/Cache/cache
~/AppData/Local/pypoetry/Cache/artifacts
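If you would rather script it so the same step works on Windows and Linux, something like this sketch does the equivalent of the two rm -rf commands. `clear_poetry_downloads` is a hypothetical helper name, and the cache root is passed in rather than guessed, since it differs per platform and can be changed in Poetry's configuration.

```python
import shutil
from pathlib import Path


def clear_poetry_downloads(cache_root: Path) -> list[Path]:
    """Delete the cache/ and artifacts/ subfolders of cache_root; return what was removed."""
    removed = []
    for name in ("cache", "artifacts"):
        target = cache_root / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed


# Examples, using the locations reported in this thread:
#   clear_poetry_downloads(Path.home() / "AppData" / "Local" / "pypoetry" / "Cache")  # Windows
#   clear_poetry_downloads(Path.home() / ".cache" / "pypoetry")                       # Linux
```

Anything else under the cache root is left alone, and Poetry will download the packages again on the next install.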

jrg94 avatar Mar 13 '25 22:03 jrg94