spark-operator
[BUG] Failed to pull image "ghcr.io/kubeflow/spark-operator:v1beta2-1.3.3-3.1.1"
Description
Unable to start a Spark job in Kubernetes.
- [*] ✋ I have searched the open/closed issues and my issue is not listed.
Reproduction Code [Required]
Steps to reproduce the behavior:
- Set up a new Kubernetes cluster. I set one up in gcloud.
- Get the Kubernetes cluster config
- helm repo add spark-operator https://kubeflow.github.io/spark-operator
- helm install spark-operator spark-operator/spark-operator \
--namespace default \
--set 'image.tag=v1beta2-1.3.3-3.1.1' \
--set sparkJobNamespace=default
Expected behavior
The Spark operator pod spins up.
Actual behavior
The pod failed with ImagePullBackOff and reported the following error:
Failed to pull image "ghcr.io/kubeflow/spark-operator:v1beta2-1.3.3-3.1.1": rpc error: code = NotFound desc = failed to pull and unpack image "ghcr.io/kubeflow/spark-operator:v1beta2-1.3.3-3.1.1": failed to resolve reference "ghcr.io/kubeflow/spark-operator:v1beta2-1.3.3-3.1.1": ghcr.io/kubeflow/spark-operator:v1beta2-1.3.3-3.1.1: not found
The errors started on 04/13/2024 at 1:00 AM.
Terminal Output Screenshot(s)
Environment & Versions
- Spark Operator App version: 3.1.1
- Helm Chart Version: v3.12.3
- Kubernetes Version: v1.28.7-gke.1026000
- Apache Spark version:
Additional context
I checked https://github.com/kubeflow/spark-operator/pkgs/container/spark-operator. It looks like version v1beta2-1.3.3-3.1.1 was never published.
@yuchaoran2011 Can you push this version to fix the issue? Thanks
Root cause is https://github.com/kubeflow/spark-operator/pull/1937
/kind bug
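For anyone who wants to confirm from the command line whether a tag exists in a registry, here is a minimal sketch. It assumes the docker CLI is installed and that the registry allows anonymous pulls; `image_exists` and `check_image` are hypothetical helper names, not part of the operator or the chart.

```shell
# Returns 0 if the registry can resolve the image reference, non-zero otherwise.
# Assumption: the docker CLI is available; private registries also need `docker login`.
image_exists() {
  docker manifest inspect "$1" > /dev/null 2>&1
}

# Prints a one-line verdict for the given image reference.
check_image() {
  if image_exists "$1"; then
    echo "found: $1"
  else
    echo "not found: $1"
  fi
}

# Usage (requires network access):
#   check_image "ghcr.io/kubeflow/spark-operator:v1beta2-1.3.3-3.1.1"
```

Running the same check against each candidate repository before installing should show which one actually serves the tag you need.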
@vzhao12 Until this is addressed, you can use images from the old registry by invoking helm with an extra option:
--set 'image.repository=ghcr.io/googlecloudplatform/spark-operator'
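Put together with the install command from the report, the workaround might look like the sketch below. The release name, namespace and tag are copied from the reproduction steps; it assumes the tag is still present in the old registry, which is worth verifying first. `helm_install_cmd` is a hypothetical helper that only prints the command so it can be reviewed before running.

```shell
# Values taken from this thread; adjust to your setup.
REPO="ghcr.io/googlecloudplatform/spark-operator"   # old registry (workaround)
TAG="v1beta2-1.3.3-3.1.1"                           # tag from the original report

# Prints the helm command instead of running it, so it can be inspected first.
helm_install_cmd() {
  echo helm install spark-operator spark-operator/spark-operator \
    --namespace default \
    --set "image.repository=${REPO}" \
    --set "image.tag=${TAG}" \
    --set sparkJobNamespace=default
}

helm_install_cmd
```

Once the printed command looks right, it can be pasted into the terminal as is.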
@vzhao12 I am still getting an ImagePullBackOff error. Does anyone have an idea? I am using this command:
helm install my-release spark-operator/spark-operator --namespace spark-operator --create-namespace --set 'image.repository=ghcr.io/googlecloudplatform/spark-operator'
Use 'image.repository=ghcr.io/kubeflow/spark-operator' and 'image.tag=v1beta2-1.4.3-3.5.0'.
We just released a new image update with important registry fixes. Check it out:
Image tag: https://github.com/kubeflow/spark-operator/tree/v1beta2-1.4.5-3.5.0
Helm chart: https://github.com/kubeflow/spark-operator/releases/tag/spark-operator-chart-1.2.14
Please give it a try and let us know if you encounter any issues. We're working on a new Kubeflow Spark Operator release and your testing will help make it stable! Feel free to share feedback on the Kubeflow Spark operator channel.
@vara-bonthu Users will still need to --set=image.repository=... if they are using any tag other than v1beta2-1.4.5-3.5.0, since previous docker images have not yet been replicated to the chart's default repository (docker.io/kubeflow/spark-operator).
Still only one tag exists in the default container registry: https://hub.docker.com/r/kubeflow/spark-operator/tags
Edit: Changed tag to match @RyanZotti's comment
I think you meant any tag other than v1beta2-1.4.5-3.5.0. The 1.4.3 version isn't available but 1.4.5 is.
This issue has been automatically marked as stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days. Thank you for your contributions.