cluster-api-provider-vsphere
CAPV CSI driver isn't passing TLS thumbprint
/kind bug
What steps did you take and what happened:
I have a largely stock configuration, and the vsphere-csi-controller is in CrashLoopBackOff. Reviewing the logs, I see this:
{"level":"error","time":"2021-03-29T20:53:48.890658758Z","caller":"service/service.go:135","msg":"failed to init controller. Error: Post https://e4vmw0vic06.datalinklabs.local:443/sdk: x509: certificate signed by unknown authority","TraceId":"bb6ce13f-5059-40aa-948d-abe39ee9ceeb","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service.(*service).BeforeServe\n\t/build/pkg/csi/service/service.go:135\ngithub.com/rexray/gocsi.(*StoragePlugin).Serve.func1\n\t/go/pkg/mod/github.com/rexray/[email protected]/gocsi.go:246\nsync.(*Once).doSlow\n\t/usr/local/go/src/sync/once.go:66\nsync.(*Once).Do\n\t/usr/local/go/src/sync/once.go:57\ngithub.com/rexray/gocsi.(*StoragePlugin).Serve\n\t/go/pkg/mod/github.com/rexray/[email protected]/gocsi.go:211\ngithub.com/rexray/gocsi.Run\n\t/go/pkg/mod/github.com/rexray/[email protected]/gocsi.go:130\nmain.main\n\t/build/cmd/vsphere-csi/main.go:64\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:203"}
{"level":"info","time":"2021-03-29T20:53:48.890709507Z","caller":"service/service.go:110","msg":"configured: \"csi.vsphere.vmware.com\" with clusterFlavor: \"VANILLA\" and mode: \"controller\"","TraceId":"bb6ce13f-5059-40aa-948d-abe39ee9ceeb"}
time="2021-03-29T20:53:48Z" level=info msg="removed sock file" path=/var/lib/csi/sockets/pluginproxy/csi.sock
time="2021-03-29T20:53:48Z" level=fatal msg="grpc failed" error="Post https://e4vmw0vic06.datalinklabs.local:443/sdk: x509: certificate signed by unknown authority"
I am providing the TLS fingerprint via the clusterctl configuration. I did notice that the secret/csi-vsphere-config object doesn't include the fingerprint key/value pair.
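For anyone who wants to confirm the same symptom, here is a minimal sketch of how the check could be scripted. It assumes you have already extracted the csi-vsphere.conf contents from the Secret as plain text. The sample config, host name, and helper names are hypothetical and only illustrate the shape of the file.

```python
import configparser

# Illustrative csi-vsphere.conf contents (an assumption, not copied from a real cluster).
SAMPLE_CONF = '''
[Global]
cluster-id = "default/my-cluster"

[VirtualCenter "vcenter.example.local"]
user = "administrator@vsphere.local"
password = "p%ssword"
datacenters = "dc1"
'''


def global_options(conf_text):
    """Return the key/value pairs of the [Global] section, quotes stripped."""
    # interpolation=None so that '%' in a password is not treated specially;
    # only '=' is accepted as the key/value delimiter.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(conf_text)
    if not parser.has_section("Global"):
        return {}
    return {key: value.strip('"') for key, value in parser.items("Global")}


def tls_settings_missing(conf_text):
    """Return the TLS-related keys that are absent from [Global]."""
    present = global_options(conf_text)
    candidates = ("thumbprint", "insecure-flag", "ca-file")
    return [key for key in candidates if key not in present]


if __name__ == "__main__":
    # For the sample above, all three TLS-related keys are reported as missing.
    print(tls_settings_missing(SAMPLE_CONF))
```

If the list contains all three keys, the driver has been given no thumbprint, no CA file, and no permission to skip verification, which would be consistent with the x509 error in the logs.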
What did you expect to happen: I'd expect it to either use the TLS fingerprint or give me an option to accept insecure certificates.
Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]
Environment:
- Cluster-api-provider-vsphere version:
- Kubernetes version: (use kubectl version):
- OS (e.g. from /etc/os-release):
/assign
As a workaround, you can add insecure-flag = true to the [Global] section of the csi-vsphere.conf file in the csi-vsphere-config Secret.
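If you would rather script that edit than do it by hand, something along these lines might work. It is only a sketch: set_global_option is a hypothetical helper, it operates on the config text after you have pulled it out of the Secret, and writing the result back is left to you.

```python
def set_global_option(conf_text, key, value):
    """Return conf_text with `key = value` set in the [Global] section.

    An existing line for the key is overwritten; otherwise the new line is
    inserted directly under the [Global] header. Other lines are untouched.
    """
    lines = conf_text.splitlines()
    new_line = f"{key} = {value}"
    header_index = None
    in_global = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header: track whether we are inside [Global].
            in_global = stripped == "[Global]"
            if in_global:
                header_index = i
        elif in_global and "=" in stripped and stripped.split("=", 1)[0].strip() == key:
            lines[i] = new_line  # key already present: overwrite it
            return "\n".join(lines) + "\n"
    if header_index is None:
        raise ValueError("no [Global] section found")
    lines.insert(header_index + 1, new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical config text, used only to demonstrate the helper.
    conf = '[Global]\ncluster-id = "default/demo"\n\n[VirtualCenter "vc.example.local"]\nuser = "u"\n'
    print(set_global_option(conf, "insecure-flag", "true"))
```

Keep in mind that insecure-flag = true turns off certificate verification for the vCenter connection, so it is a stopgap rather than a fix.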
/lifecycle stale
/lifecycle frozen
It looks like this is the problematic code that's responsible for generating the default flavor template: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/0679b6187bc6b13e96265a3e4ee778ea8086e9bc/packaging/flavorgen/flavors/crs/csi.go#L129
It should be extended so that the [Global] section looks like this:
[Global]
cluster-id = "${NAMESPACE}/${CLUSTER_NAME}"
insecure-flag = false
thumbprint = "${VSPHERE_TLS_THUMBPRINT}"
With that config I can verify that the CSI integration works as expected. You can review the details of the two new properties, insecure-flag and thumbprint, in the vSphere CSI docs. I'd patch it myself, but I have no Go experience and don't want to break anything. I hope this information helps with a quick fix, since it looks like a two-line change to me.
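To illustrate what the rendered result would look like once the variables are filled in, here is a small emulation of the ${VAR} substitution using Python's string.Template. This is not how clusterctl or flavorgen implement it, and the variable values (including the thumbprint) are made-up placeholders.

```python
from string import Template

# The [Global] section proposed above, with clusterctl-style ${VAR} placeholders.
GLOBAL_SECTION = Template(
    '[Global]\n'
    'cluster-id = "${NAMESPACE}/${CLUSTER_NAME}"\n'
    'insecure-flag = false\n'
    'thumbprint = "${VSPHERE_TLS_THUMBPRINT}"\n'
)


def render_global_section(variables):
    """Fill in the placeholders; raises KeyError if a variable is missing."""
    return GLOBAL_SECTION.substitute(variables)


if __name__ == "__main__":
    print(render_global_section({
        "NAMESPACE": "default",
        "CLUSTER_NAME": "demo",
        # Placeholder value, not a real certificate thumbprint.
        "VSPHERE_TLS_THUMBPRINT": "AA:BB:CC",
    }))
```

Using substitute() rather than safe_substitute() means a missing VSPHERE_TLS_THUMBPRINT fails loudly instead of leaving an unresolved placeholder in the generated config.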
If this gets patched, it would make sense to add an option for the ca-file property as well, in case someone wants to provide certificates instead of only validating thumbprints (although SHA256 collisions seem unlikely today, a tomorrow exists).
Given that the original assignee didn't respond on the original PR, I think we can proceed with https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/pull/1220
Right, my bad. I completely missed that there was already a PR for this in a duplicate issue.
/unassign /assign @scdubey
@srm09: GitHub didn't allow me to assign the following users: scdubey.
Note that only kubernetes-sigs members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide
Has there been any more work done on this? The issue is obviously still open and the two PRs referencing it look like they were closed without being merged. I would potentially be interested in working on this if there isn't anyone else looking into it.
@EdgeJ Have you made any progress on this?
Best, Patrick
Hi @PatrickLaabs, I never got a response to my comment here, so I never looked into it. I no longer have access to (or work with) vSphere, so I have no plans to work on this anymore.
@EdgeJ Apologies for missing the comment on this one. I hope we can have further contributions from you sometime in the future.
This is still happening.