
[RFC] [ci] move management of CI images into this repo?

Open jameslamb opened this issue 4 months ago • 4 comments

Description

Recently, we faced rate-limiting issues from DockerHub on Azure DevOps jobs (https://github.com/microsoft/LightGBM/pull/6866#pullrequestreview-2912681013).

@shiyu1994 worked around those by manually pushing the relevant images to Azure Container Registry repos: https://github.com/microsoft/LightGBM/pull/6866#discussion_r2280491605

No one else was given access to those repos and no plan was communicated for how we might get updates from https://github.com/guolinke/lightgbm-ci-docker into those repos.

This inspired an idea.... what if we build those CI images from right here in the LightGBM repo and publish them to the GitHub Container Registry (ghcr.io)?

Opening this to discuss that.

Benefits of this work

  • reduces the risk of rate-limiting issues
    • ref: https://github.com/orgs/community/discussions/49671#discussioncomment-8795596
  • removes a dependency on @guolinke 's personal GitHub account and personal DockerHub account
    • where no one else can be an admin (ref: https://github.com/guolinke/lightgbm-ci-docker/pull/30#issuecomment-1774371311)
  • consolidates permissions / access management... if you have write access to this repo, you have access to update the CI images
    • less risk of being blocked because one person is unavailable
    • removes reliance on @guolinke 's personal GitHub account and personal DockerHub credentials
  • allows for PRs that both update the images AND update other code in the repo

Acceptance criteria

  • LightGBM CI does not rely on https://github.com/guolinke/lightgbm-ci-docker or https://hub.docker.com/repository/docker/lightgbm/vsts-agent/tags

Approach

I built a proof-of-concept in another repo: https://github.com/jameslamb/lightgbm-dask-testing/pull/75

I'm proposing the following:

  • Dockerfiles and other context for the images are checked into source control here in LightGBM
  • GitHub Actions workflow(s) for publishing those images, with the following characteristics:
    • only triggered by workflow_dispatch (a maintainer clicking a button in the GitHub UI)
    • can be triggered from any branch
  • images are only for CI... not proposing (for now) using this mechanism to publish user-facing images that include LightGBM (ref: https://github.com/microsoft/LightGBM/pull/6638#issuecomment-2351330468)
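
To make the "workflow_dispatch from any branch" idea concrete: besides the button in the UI, a workflow_dispatch workflow can also be started through GitHub's REST API, which takes the branch as `ref`. Here is a rough sketch of what that request could look like. The workflow file name `publish-ci-images.yml` and the branch name `ci/update-images` are made up for illustration; nothing like that exists in the repo yet.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow_file, ref, token):
    """Build (but do not send) a workflow_dispatch request for the GitHub REST API."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # 'ref' is the branch or tag whose copy of the workflow file should run
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


# hypothetical workflow file and branch names, placeholder token
req = build_dispatch_request(
    owner="microsoft",
    repo="LightGBM",
    workflow_file="publish-ci-images.yml",
    ref="ci/update-images",
    token="<token>",
)

# sending it would be: urllib.request.urlopen(req)
print(req.get_method(), req.full_url)
```

The token would need permission to run Actions workflows on the repo, which lines up with the "write access to this repo" point above.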

Notes

If folks are supportive of trying this, I'll open a draft PR showing what I'm thinking of. I think this could really help here, and make it easier to do updates like #5596.

cc @StrikerRUS @jmoralez @guolinke @shiyu1994 @borchero

jameslamb · Aug 28 '25 03:08

I am +1

guolinke · Aug 28 '25 05:08

Great idea! Much better than my previous proposal to use Google mirror to overcome Docker rate limits.

StrikerRUS · Aug 28 '25 09:08

Thanks! Sorry, I forgot to mention that Google mirror idea here.

I think this GHCR thing could be really nice not ONLY for rate limits but also just for general development. I'll put up something to review as time allows 😁

jameslamb · Aug 28 '25 15:08

Little late but definitely +1 from my side :)

borchero · Oct 12 '25 19:10