setup-python

Caching does not work when using an internal package repo

Open screig opened this issue 11 months ago • 11 comments

Description: The caching feature won't work when pip installs from another Python package repository.

Action version: actions/setup-python@v5

Platform:

  • [x] Ubuntu
  • [ ] macOS
  • [ ] Windows

Runner type:

  • [x] Hosted
  • [ ] Self-hosted

Current runner version: '2.321.0'
Runner Image
Image: ubuntu-22.04
Version: 20241201.1.0

Tools version: I think it applies across Python versions

Repro steps:
The code is on our internal repo so I can't share a link; here is the YAML.

First we set up Python, and specify that we wish to cache.

      - uses: actions/setup-python@v5
        with:
          python-version: '3.9'
          cache: 'pip'
          cache-dependency-path: requirements*.txt

Next I install two sets of requirements

      - name: Install dependencies f
        run: |
          echo "Python version is : ${{ matrix.python-version }}"
          pip --version
          pip install -r requirements_for_testing.txt        
          pip install -r requirements.txt --extra-index-url https://${ado_token}@OUR_INETERNAL_ADO_PACKAGE REPO/_packaging/sepypi/pypi/simple/
          pip --version
          pip list

requirements_for_testing.txt contains packages that we can get from pypi. Now here the caching works fine.

image

Next we need to install requirements that include our internal packages on our internal ADO package repo.

pip install -r requirements.txt --extra-index-url https://${ado_token}@OUR_INETERNAL_ADO_PACKAGE REPO/_packaging/sepypi/pypi/simple/

Here the caching never works.

image

Expected behaviour: I would expect it to cache.

Actual behaviour: It's not caching...
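
A minimal sketch of one way to check what pip actually finds in its cache, assuming a reasonably recent pip that provides the pip cache subcommand. The step name is hypothetical and the step is not part of the original workflow; it would run once before and once after the install step so the two outputs can be compared.

```yaml
# Hypothetical diagnostic step, not part of the original workflow.
- name: Inspect pip cache
  run: |
    # Print the directory pip uses for its cache
    pip cache dir
    # Report the size and number of files currently held in the cache
    pip cache info
```

If the reported numbers are the same before the install step on every run, the restored cache is probably not being used for the internal packages.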

screig, Dec 13 '24 15:12

Hello @screig, Thank you for creating this issue. We will investigate it and provide feedback as soon as we have some updates.

gowridurgad, Dec 16 '24 05:12

Hi @screig ,

After investigating, we identified several issues causing the caching to fail, such as secret handling and syntax errors in the pip install command. Initially, the following command was used:

pip install -r requirements.txt --extra-index-url https://${ado_token}@OUR_INETERNAL_ADO_PACKAGE_REPO/_packaging/sepypi/pypi/simple/

However, this format was invalid for pip install. We made the following fix:

pip install -r requirements.txt --extra-index-url https://${{secrets.ado_token}}@OUR_INETERNAL_ADO_PACKAGE_REPO.git@main#egg=simple&subdirectory=_packaging/sepypi/pypi/simple

This resolved the caching issue, but there was a deprecation warning with --extra-index-url. To address this, we updated the commands as follows:

pip install -r requirements.txt --extra-index-url https://${{ secrets.ado_token }}@https://github.com
pip install git+@OUR_INETERNAL_ADO_PACKAGE_REPO.git@main#egg=simple&subdirectory=_packaging/sepypi/pypi/simple

We addressed the following issues to resolve the caching problem:

  1. Ensured that secrets are correctly referenced using ${{ secrets.ado_token }} in the workflow file to avoid "bad substitution" errors.
  2. Used the git+https syntax to install packages directly from the repository.

If these workarounds do not resolve the issue, please provide a link to the build or the public repository to help us further investigate.

lmvysakh, Dec 24 '24 06:12

Hi

The repository I am referencing is provided by Azure and is intended to provide packages (wheels) and is not a code-repository and therefore is not using git.

It looks rather like this

image

Image from here, not mine.

As in this image, the service (Azure DevOps / Azure Artifacts) tells me the address to connect to in order to install packages using pip, like so:

image

I am not connecting to a git (code) repository where I have the option to use git+https syntax.

Here is the Azure documentation for the package repo service. I believe this is the only service across the Microsoft/GitHub world that can provide a Python package repo.

The GitHub Package Registry cannot handle Python packages, and I think GitHub has ruled out supporting Python packages through it in the future.

screig, Dec 24 '24 17:12

Hi @screig,

I'd like to provide some clarity on the caching capabilities supported by the setup-python action and its limitations:

The setup-python action supports caching for Python packages installed from the public PyPI repository. Additionally, if you have a requirements.txt file, the action can cache the dependencies listed in it. However, caching may not work seamlessly for packages installed from internal or private repositories, such as Azure Artifacts.

You can find the documentation for the same below:

However, we are suggesting a workaround to access the Azure Artifacts repository by the following:

  1. Create an Azure Artifacts feed in your Azure DevOps project.
  2. Generate a Personal Access Token (PAT) with the Packaging scope in Azure DevOps.
  3. Store the PAT as a secret in your GitHub repository (e.g., AZURE_ARTIFACTS_PAT).

import os
import subprocess

# Read the PAT from the environment; set it outside the script, never hard-code it
pat = os.environ['AZURE_ARTIFACTS_PAT']

# Host and path of the Azure Artifacts feed, without the URL scheme
feed_path = "pkgs.dev.azure.com/your_organization/your_project/_packaging/your_feed/pypi/simple/"

# Install the package, adding the Azure Artifacts feed as an extra index.
# The scheme appears only once, so the credentials sit directly before the host.
subprocess.run([
    'pip', 'install', '--extra-index-url', f"https://{pat}:@{feed_path}",
    'your_package_name'
], check=True)

Here is an example GitHub Actions workflow that sets up Python, configures access to the Azure Artifacts feed, and installs packages:

name: Python CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
    - name: Check out repository
      uses: actions/checkout@v3

    - name: Set up Python
      uses: actions/setup-python@v5
      with:
        python-version: '3.9'

    - name: Install dependencies from Azure Artifacts
      env:
        AZURE_ARTIFACTS_PAT: ${{ secrets.AZURE_ARTIFACTS_PAT }}
      run: |
        pip install --upgrade pip
        pip config set global.extra-index-url "https://${AZURE_ARTIFACTS_PAT}:@pkgs.dev.azure.com/your_organization/your_project/_packaging/your_feed/pypi/simple/"
        pip install -r requirements.txt

  1. Replace placeholders like your_personal_access_token, your_organization, your_project, your_feed, and your_package_name with your actual details.
  2. Ensure that the AZURE_ARTIFACTS_PAT secret is properly set in your GitHub repository.

lmvysakh, Jan 22 '25 08:01

Hi

I tried this approach of setting the global index url

Image

However I can see no difference

Image

When using the internal package repo the action is still not caching...

Image

screig, Jan 23 '25 16:01

Hi @screig,

Currently, the setup-python action does not support caching for packages from external repositories, such as Azure Artifacts. The caching functionality is primarily designed for packages from the PyPI repository and dependencies listed in a requirements.txt file.

We recognise the importance of supporting a wider range of repositories and are considering enhancements for future updates. In the meantime, you may need to implement custom caching solutions or use other methods to manage dependencies from Azure Artifacts.

Thank you for your understanding. Please feel free to reach out if you have any further concerns or need additional clarification!

lmvysakh, Feb 03 '25 09:02

Hi @screig,

We recommend trying out caching with actions/cache, as it can be used to cache Python dependencies. Below is an example workflow demonstrating the same:

- name: Set up Python
  uses: actions/setup-python@v5
  with:
    python-version: '3.9'

- name: Cache dependencies
  uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-

- name: Install dependencies from Azure Artifacts
  env:
    AZURE_ARTIFACTS_PAT: ${{ secrets.AZURE_ARTIFACTS_PAT }}
  run: |
    pip install --upgrade pip
    pip config set global.extra-index-url "https://${AZURE_ARTIFACTS_PAT}:@pkgs.dev.azure.com/your_organization/your_project/_packaging/your_feed/pypi/simple/"
    pip install -r requirements.txt
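
One detail worth checking in the example above, offered as a hedged sketch: the pattern '**/requirements.txt' only matches files named exactly requirements.txt, so a change to requirements_for_testing.txt from the original report would not change the cache key. Assuming both requirements files sit in the repository root, a broader pattern could look like this:

```yaml
- name: Cache dependencies
  uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    # Hash every root-level file matching requirements*.txt,
    # e.g. requirements.txt and requirements_for_testing.txt
    key: ${{ runner.os }}-pip-${{ hashFiles('requirements*.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-
```

The restore-keys prefix still allows an older cache to be reused when the hash changes.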

Thank you for your understanding. Please feel free to reach out if you have any further concerns or need additional clarification.

lmvysakh, Feb 14 '25 11:02

Hello @screig,

Just a gentle reminder! Could you please let us know if there are any updates from your side regarding this issue?

Thank you!

lmvysakh, Feb 25 '25 09:02

Hi

If caching is only supported for packages coming from pypi.org, I think that limits its use strictly to open source projects. In my use case I am using GitHub in an enterprise environment where we have proprietary packages.

GitHub previously dropped native support for Python packages in GitHub Packages. Collectively, these two decisions really limit what we can do with GitHub. Python is now far and away the most popular programming language, and I find it a bit surprising that GitHub is seemingly retreating from providing tooling to work with that language.

Sean

screig, Feb 25 '25 12:02

Hello @screig,

Thank you for your feedback and for sharing your use case. The current limitation when using the setup-python action for caching with composite actions arises from the fact that it supports relative paths, while absolute paths are not yet allowed. This could be considered as a feature request in the future. The following snippet works around it with a symbolic link:

# Step 1: Checkout code
- name: Checkout code
  uses: actions/checkout@v4
# Step 2: Create symbolic link for requirements.txt
- name: Create symbolic link for requirements.txt
  run: ln -s ${{ github.action_path }}/requirements.txt
  shell: bash

We understand that the current caching support, which is limited to packages from pypi.org, may not fully meet the needs of enterprise environments that rely on proprietary packages. We appreciate your input regarding the impact of this limitation and the previous changes related to Python package support in GitHub Packages. Your feedback is valuable to us, and we will take it into consideration as we evaluate and plan future improvements. If you have any specific suggestions or requirements, please feel free to share them with us.

Thank you for your understanding and patience.

lmvysakh, Mar 13 '25 09:03

Hello  @screig,

Just a gentle reminder! Could you please let us know if there are any updates from your side regarding this issue?

Thank you!

lmvysakh, Mar 24 '25 10:03

Hello @screig

Since we have not received a response for a long time, we are going to close this issue for now. Please feel free to reach out if you have any concerns or need further clarification, and we can reopen the issue. Thank you!

lmvysakh, Apr 01 '25 08:04

Hi,

We are also interested in caching when using an internal Python package repo. Based on my understanding of the conversation, this now sounds more like a feature request than a bug. As such, can it be reopened as a feature request?

Cheers.

mallman-experian, Apr 10 '25 17:04

I was able to get this working. These are the steps we define in our workflow yaml:

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          cache: 'pip'
          cache-dependency-path: |
            requirements/base.txt
            requirements/ci.txt
      - run: |
          pip install -r requirements/base.txt
          pip install -r requirements/ci.txt

We also set PIP_EXTRA_INDEX_URL in the top-level env key.

As you can see, we are specifying the requirements files explicitly.
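
For readers who want the whole picture, the steps above might fit into a workflow roughly like this. It is a sketch under assumptions: the secret name INTERNAL_INDEX_URL, the trigger, the job name and the Python version are illustrative and not taken from this comment, and the secret is assumed to hold the full index URL including credentials.

```yaml
# Sketch only; names marked hypothetical are not from the thread.
on: push

env:
  # pip reads this variable as the value for --extra-index-url.
  # INTERNAL_INDEX_URL is a hypothetical secret name.
  PIP_EXTRA_INDEX_URL: ${{ secrets.INTERNAL_INDEX_URL }}

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.9'
          cache: 'pip'
          # List each requirements file explicitly
          cache-dependency-path: |
            requirements/base.txt
            requirements/ci.txt
      - run: |
          pip install -r requirements/base.txt
          pip install -r requirements/ci.txt
```

Setting the index through the environment keeps the credentials out of the pip install command line itself.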

@screig Does this help?

mallman-experian, Apr 10 '25 21:04

Hi @screig ,

We are reopening this issue to look into the requested feature and we will get back to you once we have some feedback on the same.

lmvysakh, Apr 14 '25 10:04

Hello, thanks for raising this request and sharing the context. We understand the need for caching support when using internal package repositories and agree that it would improve performance and consistency in enterprise workflows. That said, after reviewing the proposal, we’re not planning to move forward with this enhancement at this time, because the complexity of handling various authentication mechanisms and internal configurations makes this a challenging feature to support reliably. We’ll continue to monitor interest in this area and revisit the idea if demand increases. Thank you for your continued engagement and for helping us prioritize improvements. Your feedback is always appreciated!

aparnajyothi-y, Jun 27 '25 16:06

Hello Everyone, Please let us know if you have any concerns on the above :)

aparnajyothi-y, Jul 02 '25 14:07

Hello Everyone, Please let us know if you have any concerns on the above :)

aparnajyothi-y, Jul 08 '25 07:07

Hello Everyone, we are closing this issue after two reminders, as we haven't heard from you for a long time. Please feel free to reach out to us if you have any concerns or need clarification :)

aparnajyothi-y, Jul 16 '25 12:07