kpt-config-sync
`Error: failed to download oci://localhost:5001/test-chart` with Kind and local registry
I'm trying to sync a Helm chart with Config Sync installed in Kind with a local registry, but I'm getting this error on my `RootSync`:
KNV2004: unable to sync repo
Error in the helm-sync container: {"Msg":"unexpected error rendering chart, will retry","Err":"failed to render the helm chart: exit status 1, stdout: Error: failed to download \"oci://localhost:5001/test-chart\" at version \"1.0.0\"\n","Args":{}}
Here is my setup to reproduce:
## Set up Kind with local registry
reg_name='kind-registry'
reg_port='5001'
reg_internal_port='5000'
docker run \
-d --restart=always -p "127.0.0.1:${reg_port}:${reg_internal_port}" --name "${reg_name}" \
registry:2
cat <<EOF | ./kind create cluster --image kindest/node:v1.24.6 --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:${reg_port}"]
    endpoint = ["http://${reg_name}:${reg_internal_port}"]
EOF
docker network connect "kind" "${reg_name}"
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-registry-hosting
  namespace: kube-public
data:
  localRegistryHosting.v1: |
    host: "localhost:${reg_port}"
    hostFromContainerRuntime: "${reg_name}:${reg_internal_port}"
    hostFromClusterNetwork: "${reg_name}:${reg_internal_port}"
    help: "https://kind.sigs.k8s.io/docs/user/local-registry/"
EOF
## Install CS
kubectl apply -f https://github.com/GoogleContainerTools/kpt-config-sync/releases/download/v1.13.0/config-sync-manifest.yaml
## Create local Helm chart
helm create test-chart
helm package test-chart --version 1.0.0
helm push test-chart-1.0.0.tgz oci://localhost:${reg_port}
## Confirming that I can successfully pull the Helm chart from the local registry
helm pull oci://localhost:${reg_port}/test-chart --version 1.0.0
## Sync local Helm chart
cat << EOF | kubectl apply -f -
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  sourceType: helm
  helm:
    repo: oci://localhost:${reg_port}
    chart: test-chart
    version: 1.0.0
    releaseName: test-chart
    auth: none
EOF
JFYI, if I don't use the local registry setup, but instead use a public Helm chart in GHCR, it's working successfully:
cat << EOF | kubectl apply -f -
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  sourceType: helm
  helm:
    repo: oci://ghcr.io/mathieu-benoit
    chart: my-chart
    version: 0.1.0
    releaseName: my-chart
    auth: none
EOF
Also, I confirm that this `docker` flow works successfully with this setup too:
docker pull gcr.io/google-samples/hello-app:1.0
docker tag gcr.io/google-samples/hello-app:1.0 localhost:${reg_port}/hello-app:1.0
docker push localhost:${reg_port}/hello-app:1.0
kubectl create deployment hello-server --image=localhost:${reg_port}/hello-app:1.0
Not sure if the error is coming from the Kind setup or from Config Sync, so logging this here. CC: @nan-yu @xinnywinne
In addition to the `helm` scenario explained in the main description of this issue, I just tried the `oci` format, and I'm also getting an error. Steps to reproduce:
## Build the OCI artifact
cat <<EOF > test-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: test
EOF
tar -cf test-namespace.tar test-namespace.yaml
oras push \
localhost:${reg_port}/test-namespace:v1 \
test-namespace.tar
## Confirming that I can successfully pull this OCI artifact from local registry
oras pull localhost:${reg_port}/test-namespace:v1
## Sync this OCI artifact with Config Sync
cat << EOF | kubectl apply -f -
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  sourceType: oci
  oci:
    image: localhost:${reg_port}/test-namespace:v1
    auth: none
EOF
Error on the `RootSync`:
errors:
- code: "2004"
  errorMessage: |-
    KNV2004: unable to sync repo
    Error in the oci-sync container: {"Msg":"unexpected error fetching package, will retry","Err":"failed to pull image localhost:5001/test-namespace:v1: Get \"https://localhost:5001/v2/\": dial tcp [::1]:5001: connect: connection refused; Get \"http://localhost:5001/v2/\": dial tcp [::1]:5001: connect: connection refused","Args":{}}
    For more information, see https://g.co/cloud/acm-errors#knv2004
lastUpdate: "2022-10-02T14:58:57Z"
ociStatus:
  dir: .
  image: localhost:5001/test-namespace:v1
I expect this has to do with the details of local kind registry rather than a bug in helm/oci.
From https://kind.sigs.k8s.io/docs/user/local-registry/#using-the-registry
If you build your own image and tag it like localhost:5001/image:foo and then use it in kubernetes as localhost:5001/image:foo. And use it from inside of your cluster application as kind-registry:5000.
Have you tried using `kind-registry` as the registry name?
If that doesn't work: for our local/kind e2e testing, we spin up a git server as a service in the cluster. We don't have any local/kind e2e testing for oci/helm yet, but the way I would probably go about that is to spin up the registry as a service in the cluster as well.
Hi @sdowell, to be honest, I would like to see what the code does in both paths, `oci` and `helm`: they have different implementations and produce different errors.
When I look at https://github.com/stefanprodan/flux-local-dev, it seems that they are able to deploy their OCI artifacts with the exact same setup as mine. I'm trying to see if there are any differences or something I'm missing. For example, I see that in their `OCIRepository` resource they have `insecure: true`, which, AFAIK, we don't have, but I don't know if it could be related to the 2 issues I'm facing anyway.
And I think this setup could be a good win for the CI/e2e tests of CS with Helm/OCI too, as soon as we get this working?
Back to your suggestion:
using `kind-registry` as the registry name?
What do you mean? In the name of the image when doing `push`/`pull`, etc.?
Our kind tests run in parallel on multiple clusters so for our e2e testing use case we would probably want to isolate the registry for each cluster anyways.
From kind's documentation it sounds like `kind-registry:5000` should be used from inside the cluster instead of `localhost:5001`. It also looks like they use `kind-registry:5000` in the repo you linked to.
Gotcha, good catch, I will try that soon and will report back here, thanks @sdowell. Something around:
cat << EOF | kubectl apply -f -
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  sourceType: oci
  oci:
    image: kind-registry:5000/test-namespace:v1
    auth: none
EOF
Same for Helm.
Hey @sdowell, following up on this: I just ran some tests with `helm.repo: oci://${reg_name}:${reg_internal_port}` and `oci.image: ${reg_name}:${reg_internal_port}` in the `RootSync`s, but I'm still getting errors.
For `oci`, I get a different error message with `oci.image: ${reg_name}:${reg_internal_port}`:
failed to pull image kind-registry:5000/test-namespace:v1: Get https://kind-registry:5000/v2/: http: server gave HTTP response to HTTPS client
And with `oci.image: localhost:${reg_port}` it was:
failed to pull image localhost:5001/test-namespace:v1: Get https://localhost:5001/v2/: dial tcp [::1]:5001: connect: connection refused; Get \"http://localhost:5001/v2/
For `helm`, in both cases, still the same error:
failed to render the helm chart: exit status 1, stdout: Error: failed to download oci://localhost:5001/test-chart at version 1.0.0
failed to pull image kind-registry:5000/test-namespace:v1: Get https://kind-registry:5000/v2/: http: server gave HTTP response to HTTPS client
It appears this has to do with the `insecure` flag that you referenced above. Looking into the OCI library that we use for fetching images, the transport only falls back to HTTP for localhost or if the insecure flag is set.
I don't think our API currently supports toggling that flag, and there are currently no plans to add support for it. I expect it's a similar scenario for helm, but it's just returning a generic error.
cc @nan-yu @xinnywinne
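As a rough illustration of the fallback rule described above, the following sketch encodes only what this comment states: HTTP is tried when the registry host is localhost or when an insecure option is set. It is a simplified model, not the library's actual code, and the function name `allows_http_fallback` and its `insecure` argument are hypothetical.

```shell
# Simplified model of the rule described in this thread; NOT the real library logic.
# $1: registry host, optionally with :port
# $2: "true" if a (hypothetical) insecure option is enabled
allows_http_fallback() (
  host="$1"
  insecure="$2"
  name="${host%%:*}"   # strip the port, if any
  if [ "$insecure" = "true" ] || [ "$name" = "localhost" ]; then
    echo "http-fallback"
  else
    echo "https-only"
  fi
)

allows_http_fallback "localhost:5001" false
allows_http_fallback "kind-registry:5000" false
allows_http_fallback "kind-registry:5000" true
```

This matches the errors reported earlier in the thread: with `localhost:5001` both `https://` and `http://` were attempted (and refused, since localhost inside the pod is not the registry), while with `kind-registry:5000` only `https://` was attempted against a registry that serves plain HTTP.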
Hi @mathieu-benoit, I will look into this for helm support and let you know. Thanks @sdowell for helping do the initial analysis.
I confirmed that the `insecure` flag is the root cause of the OCI failure. After adding the flag, the updated `oci-sync` container is able to sync the image from `image: kind-registry:5000/test-namespace:v1`. We're aware of this issue and have created an internal bug to track it.
Hi @mathieu-benoit, I am able to reproduce the Helm failure. When I debug it, I see the error message `failed to do request: Head \"https://kind-registry:5000/v2/test-chart/manifests/0.1.0\": http: server gave HTTP response to HTTPS client`, and here is an open issue related to this: https://github.com/helm/helm/issues/6324. Currently, Helm still does not support pulling from an insecure registry directly.
Thanks @nan-yu for tracking the issue internally for OCI.
Thanks @xinnywinne for diagnosing the issue with Helm. The link you shared is about an issue with the `helm push` command, but I don't have an issue with `push`. On the other hand, I see that the `helm pull` command has an `--insecure-skip-tls-verify` parameter; do you think that could help?
`helm pull/push/template` work with `oci://localhost:5001` but not `oci://kind-registry:5000`. I have tried `helm template --insecure-skip-tls-verify`; it does not help. Here is the most recent open issue: https://github.com/helm/helm/issues/11352. An open PR mentioned in that issue, https://github.com/helm/helm/pull/9564, relates to your question.