Caching conflicts when using `extra` dependencies
Description: This may relate to #626, and it may also conflict with your stated anti-goals, but I believe it's worth surfacing as a potential bug you may want to investigate. I can't find an existing issue that covers it, and it may be affecting many users who rely on this action.
If you build different extras with your Python project, each containing its own independent dependencies, and you want a job that checks each extra has all of its necessary dependencies, while also checking overall lint/type safety/testing, you may run into this issue as I have.
When you specify the cache (`cache: poetry`, `cache: pip`, etc.) and point to your requirements.txt or the more up-to-date pyproject.toml, the cache key doesn't take into account what you are installing in that job.
So, if I have a pyproject like so
...
[tool.poetry.dependencies]
python = ">=3.10.9,<3.11"
numpy = "^1.22.3"
boto3 = "^1.24.59"
pydantic = {version = "<2.0", extras = ["dotenv"]}
jinja2 = "^3.1.2"
openai = {version = "0.28", optional = true}
[tool.poetry.extras]
openai = ["openai"]
And in my first job, I use setup-python and then run
poetry install --all-extras
but in another job, I run
poetry install
One might assume that openai will not be installed in the second job. But if I'm using caching, everything from the first cache creation will be installed, regardless of which install command I run.
I would think that the install command itself would generate the hash, rather than the dependency file itself.
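To make the difference concrete, here is a minimal sketch of the two keying strategies. It is not the action's actual implementation: the function names, the `poetry` prefix, and the use of SHA-256 are all assumptions chosen for illustration. The first function derives the key from the dependency file alone, the second also mixes in the install command.

```python
import hashlib


def key_from_dependency_file(file_text: str, prefix: str = "poetry") -> str:
    """Hypothetical key scheme: hash only the dependency file contents."""
    digest = hashlib.sha256(file_text.encode("utf-8")).hexdigest()
    return f"{prefix}-{digest}"


def key_from_file_and_command(
    file_text: str, install_command: str, prefix: str = "poetry"
) -> str:
    """Hypothetical key scheme: hash the dependency file and the install command."""
    hasher = hashlib.sha256()
    hasher.update(file_text.encode("utf-8"))
    hasher.update(b"\0")  # separator so the two inputs cannot run together
    hasher.update(install_command.encode("utf-8"))
    return f"{prefix}-{hasher.hexdigest()}"


if __name__ == "__main__":
    # Stand-in for the real file contents; both jobs read the same file.
    dependency_file = 'openai = {version = "0.28", optional = true}\n'

    # File-only keys are identical no matter what each job installs.
    print(key_from_dependency_file(dependency_file))

    # Command-aware keys differ between the two jobs.
    print(key_from_file_and_command(dependency_file, "poetry install --all-extras"))
    print(key_from_file_and_command(dependency_file, "poetry install"))
```

Under the file-only scheme both jobs compute the same key, because the install command never enters the hash. Under the command-aware scheme the two jobs get different keys, which is the behavior I would have expected.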
Based on your non-goals, I understand if this isn't something you want to pursue, but it might be worth documenting in an overly clear way for users who may not understand this behavior upfront.
Thank you!
Hello @Ben-Epstein, I have attempted to reproduce the issue on my end but was unable to do so. In my test environment, the extras (openai) are not installed in the second job, which uses `poetry install`. Here's a screenshot for your reference. Could you assist by sharing a link to a simplified version that reproduces the problem? Thank you!
Hello @Ben-Epstein, just a gentle reminder!
Hi @Ben-Epstein, could you please assist by sharing a link to a simplified version that reproduces the problem? Thank you!
Hi @gowridurgad sorry about that. I will take a look today to reproduce. Did you use poetry in that example? My project uses poetry so I'll try that.
Hi @gowridurgad I'm so sorry for the delay in the response.
I've reproduced the issue and shared it in this PR https://github.com/Ben-Epstein/poetry-setup-python-bug/pull/2
Here are the critical steps to reproduce:
- Kick off a job that installs all dependencies through poetry (i.e. `poetry install --all-extras`).
- After the cache from that job is created, change the install to `poetry install` without the extras. You'll see that there is a cache hit, and packages you do not expect to be installed are in fact installed.
You can see steps 1 and 2 in the following commits:
- this commit creates the cache, and in the repo there is only 1 cache.
- this commit then allows the second job to run, and in the corresponding action you can see that it picked up the cache created from (1). You can see in the second step, `poetry run pip list`, that all of the extra dependencies installed by `poetry install --all-extras` in the commit from (1) are present. They shouldn't be there, since we are running `poetry install`. I.e., there should have been a cache miss.
Hi @Ben-Epstein, the reason for this behavior is that the cache key didn't change between the two jobs, and the caching mechanism is designed to reuse a cache whenever it finds one with the same key. To avoid this situation, you might consider using actions/cache with different cache keys for jobs with different requirements. Here is the screenshot for your reference. We will update the documentation accordingly.
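The behavior described above can be modeled with a toy in-memory cache. This is only a sketch under stated assumptions: `run_job`, the dictionary-backed store, and the key strings are hypothetical and do not reflect how setup-python or actions/cache are implemented. It assumes, as reported in this thread, that a restored cache brings back everything the first job installed.

```python
from typing import Dict, List


def run_job(cache: Dict[str, List[str]], key: str, packages: List[str]) -> List[str]:
    """Return the packages present after a job that restores or saves `key`."""
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: the saved environment comes back as-is, and the install
        # step only adds what is missing. Nothing removes the cached extras.
        return sorted(set(cached) | set(packages))
    # Cache miss: install from scratch and save the result under this key.
    cache[key] = list(packages)
    return sorted(packages)


if __name__ == "__main__":
    # Same key for both jobs: the second job inherits openai from the first.
    shared: Dict[str, List[str]] = {}
    run_job(shared, "poetry-abc123", ["numpy", "openai"])
    print(run_job(shared, "poetry-abc123", ["numpy"]))

    # One key per job (hypothetical suffixes): the second job misses the cache.
    separate: Dict[str, List[str]] = {}
    run_job(separate, "poetry-abc123-all-extras", ["numpy", "openai"])
    print(run_job(separate, "poetry-abc123-base", ["numpy"]))
```

Giving each job its own key suffix is the same idea as the actions/cache suggestion: jobs with different requirements never share a cache entry, at the cost of storing one cache per job.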
Hello @Ben-Epstein, the PR has been merged and the Anti-Goals for caching poetry dependencies are updated in the documentation. For reference, see https://github.com/actions/setup-python/blob/main/docs/adrs/0000-caching-dependencies.md. Thank you!