poetry
“poetry cache clear” failed to remove corrupted artifacts
- [x] I am on the latest Poetry version.
- [x] I have searched the issues of this repo and believe that this is not a duplicate.
- [ ] If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
- OS version and name: Windows 10
- Poetry version: 1.1.14
- Link of a Gist with the contents of your pyproject.toml file: (none)
Issue
(I managed to solve the problem, but still decided to report it, as I suspect this may be a solution to some hash- and cache-related issues.)
I hit an internet connection problem while installing NumPy into a new Poetry environment, manually closed the Poetry command window, and retried poetry install after regaining the connection, but got:
Invalid hashes (sha256:81efafdaee436f1f34e9121d9ea96fa95d20c73390cd2874fe3ffc75cc86425a) for NumPy (1.23.1) using archive NumPy-1.23.1-cp39-cp39-win_amd64.whl.
Following the suggestions from other hash-related issues, I tried poetry cache clear . --all, poetry cache clear --all pypi, and deleting poetry.toml, but I still got the same RuntimeError.
Then I looked for the .whl file and found it in
C:\Users\(username)\AppData\Local\pypoetry\Cache\artifacts...
After I deleted all files in the artifacts directory, everything went smoothly.
It seems that on Windows the cache clear command only removes caches in C:\Users\(username)\AppData\Local\pypoetry\Cache\cache, but the artifacts directory at the same level as "cache" is left untouched. I believe some cache-related problems that can't be solved by running poetry cache clear are related to this.
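The manual fix described above can be sketched in Python, which also runs on Windows where this report originated. This is only a sketch, not something Poetry provides: clear_artifacts is a hypothetical helper name, and the cache root (the pypoetry Cache folder named above) has to be passed in by the caller. It empties the artifacts subdirectory and leaves the sibling cache directory alone.

```python
import shutil
from pathlib import Path


def clear_artifacts(cache_root: Path) -> int:
    """Delete everything inside <cache_root>/artifacts.

    Hypothetical helper (not part of Poetry). Returns the number of
    top-level entries that were removed.
    """
    artifacts = cache_root / "artifacts"
    if not artifacts.is_dir():
        return 0  # nothing to clean

    removed = 0
    for entry in list(artifacts.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a whole subdirectory tree
        else:
            entry.unlink()  # remove a single file or symlink
        removed += 1
    return removed
```

On Linux the call would presumably look like clear_artifacts(Path.home() / ".cache" / "pypoetry"); on Windows, pass the Cache folder from the path quoted in the report.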
To be honest, I would expect Poetry to delete files with invalid hashes from the cache automatically (or via a command argument).
Until that is supported, I crafted the following command to delete all files that were left behind by such an error:
poetry install | grep "not found in known hashes" | awk -F "archive " '{print $2}' | awk -F " not found" '{print $1}' | xargs -I{} find $POETRY_CACHE_DIR -name {} -delete
(If POETRY_CACHE_DIR is not set, substitute your cache path, e.g. ~/.cache/pypoetry/ on Linux.)
After that, the next poetry install should download the corrupted files again, while keeping all others.
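For readers without grep, awk, and xargs (for example on Windows), the same idea can be sketched in Python. Several things here are assumptions: the error line is taken to contain "archive <filename> not found in known hashes", which is inferred from the separators in the command above rather than from Poetry's source; find_bad_archives and delete_cached are hypothetical helper names; and the cache directory is supplied by the caller.

```python
from __future__ import annotations

import re
from pathlib import Path

# Assumed message shape, inferred from the awk separators in the shell command.
_PATTERN = re.compile(r"archive (\S+) not found in known hashes")


def find_bad_archives(poetry_output: str) -> list[str]:
    """Return archive file names from hash-error lines, in order, without duplicates."""
    seen: dict[str, None] = {}
    for match in _PATTERN.finditer(poetry_output):
        seen.setdefault(match.group(1), None)
    return list(seen)


def delete_cached(names: list[str], cache_dir: Path) -> list[Path]:
    """Delete every file under cache_dir whose name is in names; return the deleted paths."""
    wanted = set(names)
    deleted = []
    # Materialize the listing first so deletion does not disturb the directory walk.
    for path in list(cache_dir.rglob("*")):
        if path.is_file() and path.name in wanted:
            path.unlink()
            deleted.append(path)
    return deleted
```

The text passed to find_bad_archives could come from subprocess.run(["poetry", "install"], capture_output=True, text=True); since it is not certain which stream carries the error, concatenating stdout and stderr is probably the safer choice.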
I tried to fix this bug by changing the cache management code, but it seems that Poetry uses another package to manage the cache, so it's beyond my ability... Thanks anyway. (I haven't encountered this bug since filing the issue, but maybe this will come in handy next time.)
The main reason Poetry doesn't do this automatically at present is the observation that some sort of debounce or limited-retry logic might be needed. PRs are welcome fwiw.
Just to follow up on @helpmefindaname's very helpful script (ty @ helpmfn!). If you have an unstable connection, even just for a brief period, you might need to do this repeatedly, for example when setting up a new project (many installs).
Here's a global alias you can use (it will also prompt before deleting any files):
alias poethashnotfoundworkaround='rg "not found in known hashes" | choose -f "archive" 1 | choose 0 | xargs -I_ fd _ $POETRY_CACHE_DIR | xargs -o rm -i'
- You can pipe to the global alias for simplicity.
- It displays the file to be deleted and prompts each time.
- It assumes the user has ripgrep, fd, and choose installed. (One can do something similar with grep, find, and awk, as @helpmefindaname did. Just change their pipe chain so that the last find doesn't do the deletion and instead pipes to xargs -o rm -i, giving a user deletion prompt, for peace of mind.)
- It uses POETRY_CACHE_DIR, which you may have to set (e.g. in .zshenv); you can replace it directly in the command with the default for your system if you only have one and expect it to be static.
- This assumes you're on zsh and would go in your .zshrc file (it should work on bash too); fish doesn't have aliases, just functions, and piping and xargs don't play nicely with them.
So if Poetry gives you a "not found in known hashes" error, you can then run it like so:
- poetry add tqdm requests pydantic | poethashnotfoundworkaround
or, without aliasing, as:
- poetry add tqdm requests pydantic | rg "not found in known hashes" | choose -f "archive" 1 | choose 0 | xargs -I_ fd _ $POETRY_CACHE_DIR | xargs -o rm -i
Repeat until it stops prompting you for deletions.
Then check that installation occurred (e.g. by running the poetry add line again and verifying that everything is installed; we'll assume correctly!).
details, if you would like the cli commands parsed
Explanation for anyone that wants help parsing the above (I only just learned some of this myself):
- <initial command>
  - produces a bunch of text
- | rg "not found in known hashes"
  - we search through that text for any lines containing "not found in known hashes" and pass those on
- | choose -f "archive" 1
  - we break each line into sections around the word "archive" and take the section at index 1 (0-indexed), i.e. the second one
- | choose 0
  - from each of those sections we take the first word (sections default to whitespace breaks)
- | xargs -I_ fd _ $POETRY_CACHE_DIR
  - we feed each output as an argument into the '_' placeholder, causing us to search for that file in the Poetry cache directory
- | xargs -o rm -i
  - we delete each file found, with flags to prompt and allow user input before deleting
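The first three parsing stages can be mirrored one-to-one with plain string operations, which may help when experimenting with the pipeline. This is a sketch under assumptions: pick_archive is a hypothetical name, and the sample line in the usage note only imitates the message shape implied by the rg and choose patterns above.

```python
from __future__ import annotations


def pick_archive(line: str) -> str | None:
    """Mirror the rg / choose -f "archive" 1 / choose 0 stages for a single line."""
    # Stage 1 (rg): keep only lines that mention the hash error.
    if "not found in known hashes" not in line:
        return None
    # Stage 2 (choose -f "archive" 1): split around "archive", take field 1.
    fields = line.split("archive")
    if len(fields) < 2:
        return None
    # Stage 3 (choose 0): take the first whitespace-separated word of that field.
    words = fields[1].split()
    return words[0] if words else None
```

For a line such as "... from archive numpy-1.23.1-cp39-cp39-win_amd64.whl not found in known hashes ..." this yields the wheel file name. Like the shell version, it would pick the wrong field if the word "archive" appeared earlier in the line, for instance inside a package name.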
The alias somelongname=... line would be specified in your .zshrc or similar. (It doesn't need a global flag for its intended purpose, which is to occur right after a pipe, but related uses could take one if its alias-ness couldn't be inferred.)
If you want to experiment with this, you can copy the output of some command to a text file (whatevername.txt) and then use that as input to your own pipe contraption with cat or bat, e.g.
bat whatevername.txt | rg "pattern unique to lines that I want" | choose -f 'pattern before what I want' 1
...etc
(re-emphasized note: I'm using a lot of enhanced commands (rg, fd, choose, et cetera) because I find them much nicer and think of them as contemporary defaults for the cli -- one can replace them all with their classic counterparts and some syntax switch ups)
> The main reason Poetry doesn't do this automatically at present is the observation that some sort of debounce or limited-retry logic might be needed. PRs are welcome fwiw.
That sounds optimal. For now, I wonder whether clearing the artifacts that correspond to a given cache when the user runs poetry cache clear --all <cache> would be a worthwhile step forward? I have no idea how difficult that would be to implement. Another alternative might be to clear all caches and artifacts when no cache is specified. On Linux, my ~/.cache/pypoetry/cache/ directory is < 25 MB, whereas my ~/.cache/pypoetry/artifacts directory is > 4 GB. This discrepancy comes at least in part from the fact that I regularly clear the former but not the latter.
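To check the same discrepancy on your own machine, a small sketch like the following could be used. dir_size_bytes is a hypothetical helper, and the paths in the usage note are the Linux locations mentioned above; other systems will differ.

```python
from pathlib import Path


def dir_size_bytes(directory: Path) -> int:
    """Sum the sizes of all regular files below directory (recursively)."""
    return sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())
```

Calling it once for Path.home() / ".cache" / "pypoetry" / "cache" and once for the sibling artifacts directory gives two numbers to compare.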