Update Release Workflows: Change Container Registry to Kubeflow's ghcr.io

Open vara-bonthu opened this issue 11 months ago • 7 comments

Description: Following the migration of our project from the GoogleCloudPlatform organization to the Kubeflow organization, we need to update our release GitHub workflows. This update involves changing the Docker container registry references within our GitHub workflows to align with our new organizational repository.

Current Configuration: The GitHub Actions workflows are currently configured to push Docker images to the following registry:

ghcr.io/googlecloudplatform/spark-operator

Required Changes: All references to the Docker container registry within our GitHub Actions workflows need to be updated to the new location under the Kubeflow organization:

ghcr.io/kubeflow/spark-operator

Scope of the Update:

  1. Identify and modify all instances in GitHub Actions workflows (*.yml files within the .github/workflows/ directory) where the container registry is referenced.
  2. Ensure that Docker push commands, image tags, and any other relevant configurations reflect the new registry path.
  3. Review and test the updated workflows to confirm that images are correctly being pushed to the new registry location.
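
The first two steps could be sketched as a small shell helper. This is only an illustration: `update_registry_refs` is a hypothetical function name, and it assumes the old image path appears literally in the workflow files (it also checks `*.yaml` in case any workflow uses that extension).

```shell
# Sketch only: rewrite old registry references in GitHub Actions workflows.
# update_registry_refs is a hypothetical helper, not part of the repository.
OLD_IMAGE="ghcr.io/googlecloudplatform/spark-operator"
NEW_IMAGE="ghcr.io/kubeflow/spark-operator"

update_registry_refs() {
  workflows_dir="${1:-.github/workflows}"
  # -F treats the image path as a fixed string; -l prints matching file names only.
  grep -rlF --include='*.yml' --include='*.yaml' "$OLD_IMAGE" "$workflows_dir" |
  while IFS= read -r file; do
    # In sed the dots in OLD_IMAGE match any character, which is harmless here.
    # -i.bak works with both GNU and BSD sed; the backup is removed afterwards.
    sed -i.bak "s|$OLD_IMAGE|$NEW_IMAGE|g" "$file" && rm -f "$file.bak"
    echo "updated: $file"
  done
}
```

Running `update_registry_refs` from the repository root and then reviewing `git diff` would show every reference that changed before anything is committed.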

vara-bonthu avatar Mar 16 '24 16:03 vara-bonthu

We may have permanently lost the existing Docker images from ghcr.io/googlecloudplatform/spark-on-k8s-operator. I'm not seeing anything there anymore, and it seems they were never retagged into ghcr.io/kubeflow/spark-operator.

While trying to track down existing images to help get them migrated to the (canonical, I think) Kubeflow registry mentioned in this issue, I found a few images at ghcr.io/googlecloudplatform/spark-operator. EDIT: see next comment. ~~But given that the googlecloudplatform/spark-operator GitHub repo never existed as far as I know, I'm not sure how long these images might be there before being garbage collected. I thought ghcr registries had to match the repository names, but I could be wrong about that.~~

I also found old mentions of gcr.io/spark-operator/spark-operator, but I don't see any old images there either.

zevisert avatar Apr 19 '24 19:04 zevisert

Actually, checking out old tags from the repo shows that existing builds and charts did in fact reference ghcr.io/googlecloudplatform/spark-operator, so I think it'd be good to retag any images from there into ghcr.io/kubeflow/spark-operator. An admin in this repo could do that with something like this:

# Using CLI tools:
# crane: https://github.com/google/go-containerregistry/blob/main/cmd/crane/doc/crane.md
# gh: https://cli.github.com/
# Log in to ghcr.io with the token from the GitHub CLI (quoted so it is passed as one argument).
crane auth login ghcr.io --username zevisert --password "$(gh auth token)"

# Copy every tag from the old repository path to the same tag under the new one.
for tag in $(crane ls ghcr.io/googlecloudplatform/spark-operator); do
  crane copy "ghcr.io/googlecloudplatform/spark-operator:$tag" "ghcr.io/kubeflow/spark-operator:$tag"
done

zevisert avatar Apr 19 '24 19:04 zevisert

Thanks for looking into this, @zevisert! 👍🏼

@AndrewChubatiuk, could you point us to someone in the Kubeflow org who could handle the step suggested by @zevisert?

This might be connected to issue #1991.

vara-bonthu avatar Apr 19 '24 19:04 vara-bonthu

As I said before in https://github.com/kubeflow/spark-operator/pull/1974, we initially agreed to publish operator images to the Kubeflow Docker Hub. I think that right now you can publish the controller image only when you make a new release, because of your GitHub Actions setup.

For the old image tags, we can also push them to the new registry (docker.io/kubeflow/spark-operator) if the community agrees on this. @vara-bonthu @zevisert Let's discuss it in our first Spark Operator community call?

andreyvelich avatar Apr 19 '24 19:04 andreyvelich

Having the existing images along with any new images in one registry is something that'd help people regardless of which registry we choose long term. The script I suggested above would just need to a) log in to Docker Hub as well, and b) have the destination argument to crane copy changed to "docker.io/kubeflow/spark-operator:$tag"
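
A minimal sketch of that change, with the source and destination passed as arguments so the same loop works for either registry. `mirror_tags` is an illustrative name, not a crane command, and the sketch assumes `crane auth login` has already been run for every registry that needs credentials.

```shell
# Sketch only: copy every tag from one repository to another with crane.
# mirror_tags and its arguments are hypothetical names chosen for illustration.
mirror_tags() {
  src_repo="$1"   # e.g. ghcr.io/googlecloudplatform/spark-operator
  dst_repo="$2"   # e.g. docker.io/kubeflow/spark-operator
  # `crane ls` prints one tag per line for the given repository.
  for tag in $(crane ls "$src_repo"); do
    crane copy "$src_repo:$tag" "$dst_repo:$tag"
  done
}

# Usage, after logging in to both registries:
# mirror_tags ghcr.io/googlecloudplatform/spark-operator docker.io/kubeflow/spark-operator
```

Keeping the destination as a parameter means the registry decision can be made later without touching the loop.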

zevisert avatar Apr 19 '24 23:04 zevisert

I suppose most people will be using the image defined in the Helm chart, but it still makes sense to keep both registries, especially given Docker Hub rate limits.

AndrewChubatiuk avatar Apr 20 '24 05:04 AndrewChubatiuk

Aside: I joined the Kubeflow community call on Tuesday morning, but we didn't get the chance to talk about this. I don't know if the community poll has ended yet for deciding on a time for the spark-operator community call.

But, seeing that @andreyvelich can publish images to Docker Hub, he could fix a few of these image-not-found issues (#1991, #2004) by mirroring the historical Docker images onto the official Kubeflow Docker Hub as I was suggesting earlier. Users are struggling to figure out which registry has the image tag they are trying to run. Spelled out fully as of 2024-04-26T18:42Z, those tags are:

> crane ls ghcr.io/googlecloudplatform/spark-operator --full-ref
ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.2-3.1.1
ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.3-3.1.1
ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.4-3.1.1
ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.5-3.1.1
ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.6-3.1.1
ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.7-3.1.1
ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.8-3.1.1

> crane ls ghcr.io/kubeflow/spark-operator --full-ref
ghcr.io/kubeflow/spark-operator:v1beta2-1.4.2-3.5.0
ghcr.io/kubeflow/spark-operator:v1beta2-1.4.3-3.5.0
ghcr.io/kubeflow/spark-operator:preview

> crane ls docker.io/kubeflow/spark-operator --full-ref
index.docker.io/kubeflow/spark-operator:v1beta2-1.4.5-3.5.0
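
For anyone trying to locate a specific tag across these three repositories, a lookup could be sketched as follows. `find_tag` is a hypothetical helper, and the repository list is simply the three paths listed above; it relies on `crane digest` exiting non-zero when a reference cannot be resolved.

```shell
# Sketch only: report which of the known repositories serves a given tag.
# find_tag is a hypothetical helper name; the repository list is an assumption
# taken from the three listings in this comment.
find_tag() {
  tag="$1"
  for repo in \
    ghcr.io/googlecloudplatform/spark-operator \
    ghcr.io/kubeflow/spark-operator \
    docker.io/kubeflow/spark-operator
  do
    # Print the full reference only when the registry can resolve it.
    if crane digest "$repo:$tag" >/dev/null 2>&1; then
      echo "$repo:$tag"
    fi
  done
}

# Usage:
# find_tag v1beta2-1.3.8-3.1.1
```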

To get all known images onto one registry, so users can easily refer to any tag, I suggest he run the following:

crane auth login docker.io --username andreyvelichkevich

crane cp ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.2-3.1.1 docker.io/kubeflow/spark-operator:v1beta2-1.3.2-3.1.1
crane cp ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.3-3.1.1 docker.io/kubeflow/spark-operator:v1beta2-1.3.3-3.1.1
crane cp ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.4-3.1.1 docker.io/kubeflow/spark-operator:v1beta2-1.3.4-3.1.1
crane cp ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.5-3.1.1 docker.io/kubeflow/spark-operator:v1beta2-1.3.5-3.1.1
crane cp ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.6-3.1.1 docker.io/kubeflow/spark-operator:v1beta2-1.3.6-3.1.1
crane cp ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.7-3.1.1 docker.io/kubeflow/spark-operator:v1beta2-1.3.7-3.1.1
crane cp ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.8-3.1.1 docker.io/kubeflow/spark-operator:v1beta2-1.3.8-3.1.1
crane cp ghcr.io/kubeflow/spark-operator:v1beta2-1.4.2-3.5.0            docker.io/kubeflow/spark-operator:v1beta2-1.4.2-3.5.0
crane cp ghcr.io/kubeflow/spark-operator:v1beta2-1.4.3-3.5.0            docker.io/kubeflow/spark-operator:v1beta2-1.4.3-3.5.0
crane cp ghcr.io/kubeflow/spark-operator:preview                        docker.io/kubeflow/spark-operator:preview  # optional
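
After copying, each tag could be spot-checked by comparing digests, since a registry-to-registry copy is expected to keep the manifest digest unchanged. This is a sketch under that assumption; `verify_copy` is a hypothetical helper name.

```shell
# Sketch only: confirm a copied tag resolves to the same digest in both places.
# verify_copy is a hypothetical helper; $1 is the source reference, $2 the destination.
verify_copy() {
  src_digest=$(crane digest "$1") || return 1
  dst_digest=$(crane digest "$2") || return 1
  if [ "$src_digest" = "$dst_digest" ]; then
    echo "ok: $2"
  else
    echo "mismatch: $2" >&2
    return 1
  fi
}

# Usage:
# verify_copy ghcr.io/kubeflow/spark-operator:v1beta2-1.4.3-3.5.0 \
#             docker.io/kubeflow/spark-operator:v1beta2-1.4.3-3.5.0
```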

zevisert avatar Apr 26 '24 18:04 zevisert

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jul 25 '24 20:07 github-actions[bot]

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

github-actions[bot] avatar Aug 14 '24 22:08 github-actions[bot]