OpenHands Refined `ghcr.yml` for Secure and Efficient Docker Operations

Description

We currently have a GitHub Actions setup that manages the build and push process for Docker images. This setup is triggered by both direct pushes and pull requests from forks. However, there are significant challenges related to security, efficiency, and complexity due to the way we handle different triggers. Specifically, our current workflow does not sufficiently differentiate between operations that are safe for PRs from forks and those that should only occur in trusted scenarios.

Current Issues

Security Risks: The existing workflow indiscriminately handles ghcr.io logins and image pushes regardless of the origin of the PR, potentially exposing sensitive credentials during builds from forks.
Inefficiency in Caching: Our current caching mechanisms do not differentiate between fork-based PRs and main repository operations, potentially leading to increased build times and cache pollution.
Workflow Complexity: The logic for determining when to push images and manage caches is complex and difficult to maintain, particularly as we scale up the number of Docker images or adjust workflow conditions. There are too much builds!

Proposed Enhancements

Enhanced Security Measures:

Conditional Execution: Implement steps that conditionally perform ghcr.io logins and image pushes. Specifically, these actions should only execute for non-fork PRs and direct repository pushes to prevent unauthorized access to ghcr.io credentials during the CI/CD process triggered by forks.

Optimized Caching Strategy:

Differentiated Caching: Utilize a more nuanced caching approach that varies based on the origin of the PR (fork vs. main repo). This strategy aims to reduce build times and prevent the risk of cache corruption from untrusted sources.

Implementation of Workflow Changes

jobs:
  docker-build:
    runs-on: ubuntu-latest
    steps:
      - name: Login to GitHub Container Registry
        if: github.event_name != 'pull_request' || !github.event.pull_request.head.repo.fork
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}
      - uses: docker/build-push-action@v5
        with:
          context: .
          file: ./Dockerfile
          push: ${{ github.event_name != 'pull_request' || !github.event.pull_request.head.repo.fork }}
          tags: ${{ github.repository_owner }}/image:latest
          cache-from: type=local,src=/tmp/.buildx-cache
          cache-to: type=local,mode=max,dest=/tmp/.buildx-cache
          platforms: linux/amd64,linux/arm64

Looks like there will be a lot of changes if we implement this change so I'm just giving a suggestion.

May 10 '24 02:05 Umpire2018

@Umpire2018 we do actually do a bit of differentiation here: https://github.com/OpenDevin/OpenDevin/blob/f8d4b1ab0d85edfc584d0b4cd1960c3968b5f336/.github/workflows/ghcr.yml#L68

We don't --push if it's a fork.

That said, I was considering switching this behavior so that we can test user contributions more easily.

It's definitely a little unsafe. A maintainer does have to click a button to allow the tests to run, but I'm sure people aren't thoroughly reviewing the code each time...

I believe we can modify GH settings so that credentials aren't mounted for forks

May 10 '24 02:05 rbren

We don't --push if it's a fork.

Thanks for the heads up and i also noticed that. But i can't bring up a solution of comprehensive improvement of build condition control by using docker/build-push-action@v5.

May 10 '24 03:05 Umpire2018

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

Jun 13 '24 01:06 github-actions[bot]