Poetry cache is not invalidated when the cached virtual environment's name / location is incorrect
Description:
When using 'poetry' as the cache mode, a cache that was generated with an incorrect virtualenv name / path is never invalidated until poetry.lock changes, even though it does not work as a functional cache.
Details:
- The cache key is built from a hash of the content of poetry.lock
- What is cached is the entire content of .cache/pypoetry/virtualenvs/
- In the problematic workflows, the hit cache contained the expected dependencies, but with an incorrect venv name (here spindevops-GlKMN5CI-py3.9), and thus at the wrong path
# Cache hit decompression
2024-07-01T15:52:01.3617921Z [command]/usr/bin/tar -tf /home/runner/work/_temp/97a9f9c2-8443-415f-a3b4-5f5721353865/cache.tzst -P --use-compress-program unzstd
2024-07-01T15:52:01.3716609Z ../../../.cache/pypoetry/virtualenvs/
2024-07-01T15:52:01.3718377Z ../../../.cache/pypoetry/virtualenvs/spindevops-GlKMN5CI-py3.9/
[...]
2024-07-01T15:52:09.4686405Z Cache Size: ~93 MB (97472830 B)
2024-07-01T15:52:09.4687371Z [command]/usr/bin/tar -xf /home/runner/work/_temp/97a9f9c2-8443-415f-a3b4-5f5721353865/cache.tzst -P -C /home/runner/work/spindevops/spindevops --use-compress-program unzstd
2024-07-01T15:52:09.4687574Z Cache restored successfully
2024-07-01T15:52:09.4688278Z ##[debug]pythonLocation is /opt/hostedtoolcache/Python/3.9.19/x64/bin/python
2024-07-01T15:52:09.4688742Z [command]/home/runner/.local/bin/poetry env use /opt/hostedtoolcache/Python/3.9.19/x64/bin/python
2024-07-01T15:52:09.4689235Z Creating virtualenv spindevops-i9_S8efx-py3.9 in /home/runner/.cache/pypoetry/virtualenvs
2024-07-01T15:52:09.4689701Z Using virtualenv: /home/runner/.cache/pypoetry/virtualenvs/spindevops-i9_S8efx-py3.9
- Poetry then sees that the environment with the expected name (here spindevops-i9_S8efx-py3.9) is empty, and runs a complete installation when poetry install is run
- In post-job cleanup, setup-python sees a cache hit, and thus doesn't refresh the cache
- A cache using an incorrect venv name is thus never invalidated until the next change of poetry.lock, despite being effectively useless
- Deleting the incorrect cache fixes the issue
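The mismatch described in the list above can be modeled in a few lines. This is a minimal sketch, not setup-python's code: the key format and the `cache_key` and `cache_is_useful` helpers are hypothetical, the lock content is a placeholder, and only the two virtualenv names come from the log.

```python
import hashlib


def cache_key(lock_content: bytes, python_version: str = "3.9") -> str:
    """Hypothetical key: depends only on the Python version and poetry.lock."""
    digest = hashlib.sha256(lock_content).hexdigest()
    return f"poetry-py{python_version}-{digest}"


def cache_is_useful(restored_venvs: set[str], expected_venv: str) -> bool:
    """A restored cache only helps if it contains the venv Poetry will use."""
    return expected_venv in restored_venvs


lock = b"# contents of poetry.lock (placeholder)\n"
saved_key = cache_key(lock)   # key at the time the cache was saved
lookup_key = cache_key(lock)  # key on a later run, poetry.lock unchanged

# The key matches, so the run counts as a cache hit...
hit = saved_key == lookup_key
# ...but the restored directory does not hold the venv Poetry expects.
useful = cache_is_useful({"spindevops-GlKMN5CI-py3.9"}, "spindevops-i9_S8efx-py3.9")
print(hit, useful)
```

Because nothing in the key depends on the venv name, `hit` is true while `useful` is false, and the situation persists until poetry.lock changes.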
Action version: v5
Platform:
- [x] Ubuntu
- [ ] macOS
- [ ] Windows
Runner type:
- [x] Hosted
- [ ] Self-hosted
Tools version: Python 3.9.19
Repro steps:
I don't know how to reproduce, because I'm unsure how an incorrect Poetry virtualenv was generated and cached in the first place. What's in our action is this step:
- name: Setup python
  uses: actions/setup-python@v5
  with:
    python-version: 3.9
    cache: poetry
Expected behavior: Cache is invalidated if the cached virtual environment is incorrect / a new venv is created
Actual behavior:
Cache is never invalidated until poetry.lock changes, even if the cached virtual environment is incorrect.
Hello @romaingd-spi 👋, Thank you for creating this issue. We will investigate it and get back to you as soon as we have some feedback.
Hi @romaingd-spi, When the pyproject.toml file is changed but the poetry.lock file is not updated, the cache will not automatically update. This occurs because the cache key is typically based on the poetry.lock file, which precisely represents the state of dependencies.
If the name in the pyproject.toml file is changed without updating the poetry.lock file, the cache key remains the same because it is based on the unchanged poetry.lock file. This results in a cache hit, using the existing cache. However, since the environment name has changed in pyproject.toml, Poetry detects this difference and creates a new virtual environment with the new name. Consequently, dependencies are reinstalled in the new virtual environment, even though the cache was hit.
To avoid such issues:
- Ensure that the poetry.lock file is updated whenever changes are made to the pyproject.toml file.
- Alternatively, running poetry install --no-cache can ensure that dependencies are correctly reinstalled without relying on the cache.
run: poetry install --no-cache
Currently, there is no functionality to validate the cache based on the virtual environment. This will be considered for potential future enhancements.
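Until such validation exists in the action, a workflow could inspect the restored cache itself. A minimal sketch, assuming the expected environment name is obtained separately (for example from the last path component printed by `poetry env info --path`); `unexpected_venvs` is a hypothetical helper, not part of setup-python or Poetry.

```python
from pathlib import Path


def unexpected_venvs(venvs_dir: Path, expected_name: str) -> list[str]:
    """List environment directories under venvs_dir that Poetry will not use."""
    if not venvs_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in venvs_dir.iterdir()
        if entry.is_dir() and entry.name != expected_name
    )
```

On a hosted Ubuntu runner this could be called with `Path.home() / ".cache/pypoetry/virtualenvs"` after the setup step. A non-empty result suggests the cache holds an environment under a different name, in which case deleting the cache entry is the fix reported above.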
Hi @gowridurgad! Thank you for investigating and providing a detailed answer. I understand that there's no corresponding functionality currently, and I don't think it's a major issue.
> If the name in the pyproject.toml file is changed without updating the poetry.lock file
Note that the project name in the pyproject.toml file didn't change (to my knowledge), as visible (in the example) in the environment names created by poetry: spindevops-GlKMN5CI-py3.9 and spindevops-i9_S8efx-py3.9 both refer to a spindevops project. Additionally, we have checks in CI that ensure via poetry lock --check that the lockfile is consistent with pyproject.toml.
In my understanding, the environment name is created by Poetry using the project name, the path, and the Python version. Given that none of those changed, it's unclear to me how an incorrect venv name was generated and cached in the first place. The problem did not happen again after this issue was posted, so I guess we fell in some shady edge case.
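To make that reasoning concrete, here is an illustrative model of a name built from the project name, a hash of the project path, and the Python version. It is a sketch under assumptions, not Poetry's implementation: `venv_name`, the use of SHA-256, and the 8-character URL-safe base64 suffix are assumptions chosen to resemble names like spindevops-i9_S8efx-py3.9, and the second path is hypothetical.

```python
import base64
import hashlib


def venv_name(project: str, project_path: str, python_version: str) -> str:
    """Illustrative only: <project>-<8-char hash of path>-py<version>."""
    digest = hashlib.sha256(project_path.encode("utf-8")).digest()
    suffix = base64.urlsafe_b64encode(digest).decode("ascii")[:8]
    return f"{project.lower()}-{suffix}-py{python_version}"


# The first path appears in the log above; the second is made up for contrast.
a = venv_name("spindevops", "/home/runner/work/spindevops/spindevops", "3.9")
b = venv_name("spindevops", "/home/runner/work/spindevops/other-dir", "3.9")
print(a)
print(b)
```

Under this model, the same project and Python version yield a different name whenever the path differs, which is one way two names sharing the spindevops prefix could arise. Whether that is what happened here is not known.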
Hi @romaingd-spi, If the project name, path, and Python version haven't changed, here are some additional reasons why a new virtual environment name might be generated by Poetry:
- Poetry Version Update: An update to Poetry itself might change how it hashes environments or manages virtual environments.
- System Configuration Changes: Changes to the system configuration or updates to dependencies that are globally installed on your machine could cause Poetry to perceive a difference.
- Virtual Environment Corruption: If the existing virtual environment becomes corrupted or if there are issues with its integrity, Poetry might create a new one.
- Machine-Specific Factors: If you're working on different machines or in different environments, this might influence the environment's hash.
Hope this clarifies things. If you need further investigation, please provide a link to the build or a public repository to reproduce the issue.
Thank you for the additional hints. The error unfortunately happened in a private repository. I'll investigate on my end if it happens again, and come back with additional information if it can bring further clarity. The core issue reported here, i.e. no functionality to validate the cache based on the venv itself, has been addressed by your previous comments. Thanks for the help!
Hello @romaingd-spi
I am proceeding with closing the issue. Please feel free to contact us in case of any further concerns.
Thank you!