AutoTLS with cert-manager creating kcert, but not cert
Currently trying to set up knative serving AutoTLS on a bare-metal cluster. The cluster has an existing cert-manager and ClusterIssuer, which has previously been used to generate certificates for services, showing that the issuer is working as expected. After installing knative using the operator, I successfully set up a service without TLS, which suggests that knative is also working as expected.
Once AutoTLS is enabled, the knative service shows a state of "Unknown" with the reason "CertificateNotReady". I can see that a knative certificate has been created, but its Ready, Reason, and Events fields are all empty. From looking at other issues, I can see that a cert-manager certificate should also be created, though this doesn't appear to be happening.
kubectl describe knativeservings.operator.knative.dev -n knative-serving knative-serving
Name: knative-serving
Namespace: knative-serving
Labels: networking.knative.dev/certificate-provider=cert-manager
Annotations: <none>
API Version: operator.knative.dev/v1beta1
Kind: KnativeServing
Metadata:
Creation Timestamp: 2022-04-21T21:10:21Z
Finalizers:
knativeservings.operator.knative.dev
Generation: 8
Resource Version: 106678848
UID: 2d438f12-4873-4c49-9df4-a0afc8453c5e
Spec:
Config:
Autoscaler:
Enable - Scale - To - Zero: true
Certmanager:
Issuer Ref: - kind: ClusterIssuer
- name: letsencrypt
Domain:
knative.my-domain-here.com:
Network:
Auto - Tls: Enabled
Http - Protocol: Redirected
Ingress - Class: kourier.ingress.networking.knative.dev
Controller - Custom - Certs:
Name:
Type:
Ingress:
Contour:
Enabled: false
Istio:
Enabled: false
Kourier:
Enabled: true
Registry:
Status:
Conditions:
Last Transition Time: 2022-04-21T21:45:09Z
Status: True
Type: DependenciesInstalled
Last Transition Time: 2022-04-21T21:45:43Z
Status: True
Type: DeploymentsAvailable
Last Transition Time: 2022-04-21T21:45:09Z
Status: True
Type: InstallSucceeded
Last Transition Time: 2022-04-21T21:45:43Z
Status: True
Type: Ready
Last Transition Time: 2022-04-21T21:10:21Z
Status: True
Type: VersionMigrationEligible
Manifests:
/var/run/ko/knative-serving/1.3.1
Observed Generation: 8
Version: 1.3.1
Events: <none>
kubectl describe ksvc -n earthwalker earthwalker
Name: earthwalker
Namespace: earthwalker
Labels: <none>
Annotations: networking.knative.dev/ingress.class: kourier.ingress.networking.knative.dev
scale-to-zero-grace-period: 300s
serving.knative.dev/creator: kubernetes-admin
serving.knative.dev/lastModifier: kubernetes-admin
API Version: serving.knative.dev/v1
Kind: Service
Metadata:
Creation Timestamp: 2022-04-22T20:50:20Z
Generation: 1
Resource Version: 106680182
UID: c5640172-7fd3-47ae-8881-7f1718489015
Spec:
Template:
Metadata:
Annotations:
autoscaling.knative.dev/target: 200
networking.knative.dev/ingress.class: kourier.ingress.networking.knative.dev
Scale - To - Zero - Grace - Period: 300s
Creation Timestamp: <nil>
Labels:
Deployment: earthwalker
Name: earthwalker-svc
Namespace: earthwalker
Spec:
Container Concurrency: 0
Containers:
Env:
Name: EARTHWALKER_CONFIG_PATH
Value: /config/config.toml
Name: EARTHWALKER_PORT
Value: 8080
Image: registry.gitlab.com/glatteis/earthwalker:latest
Name: earthwalker
Ports:
Container Port: 8080
Protocol: TCP
Readiness Probe:
Success Threshold: 1
Tcp Socket:
Port: 0
Resources:
Limits:
Cpu: 500m
Memory: 128Mi
Requests:
Cpu: 10m
Memory: 64Mi
Volume Mounts:
Mount Path: /config
Name: earthwalker-config
Read Only: true
Sub Path: config.toml
Enable Service Links: false
Timeout Seconds: 300
Volumes:
Config Map:
Items:
Key: config.toml
Path: config.toml
Name: earthwalker-config
Name: earthwalker-config
Traffic:
Latest Revision: true
Percent: 100
Status:
Address:
URL: http://earthwalker.earthwalker.svc.cluster.local
Conditions:
Last Transition Time: 2022-04-22T20:50:28Z
Status: True
Type: ConfigurationsReady
Last Transition Time: 2022-04-22T20:50:29Z
Message: Certificate route-262671d3-9e58-45e7-b769-13e2291324e8 is not ready.
Reason: CertificateNotReady
Status: Unknown
Type: Ready
Last Transition Time: 2022-04-22T20:50:29Z
Message: Certificate route-262671d3-9e58-45e7-b769-13e2291324e8 is not ready.
Reason: CertificateNotReady
Status: Unknown
Type: RoutesReady
Latest Created Revision Name: earthwalker-svc
Latest Ready Revision Name: earthwalker-svc
Observed Generation: 1
Traffic:
Latest Revision: true
Percent: 100
Revision Name: earthwalker-svc
URL: https://earthwalker.earthwalker.knative.my-domain-here.com
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 32m service-controller Created Configuration "earthwalker"
Normal Created 32m service-controller Created Route "earthwalker"
kubectl describe kcert -n earthwalker route-262671d3-9e58-45e7-b769-13e2291324e8
Name: route-262671d3-9e58-45e7-b769-13e2291324e8
Namespace: earthwalker
Labels: serving.knative.dev/route=earthwalker
Annotations: networking.knative.dev/certificate.class: cert-manager.certificate.networking.knative.dev
networking.knative.dev/ingress.class: kourier.ingress.networking.knative.dev
scale-to-zero-grace-period: 300s
serving.knative.dev/creator: kubernetes-admin
serving.knative.dev/lastModifier: kubernetes-admin
API Version: networking.internal.knative.dev/v1alpha1
Kind: Certificate
Metadata:
Creation Timestamp: 2022-04-22T21:11:23Z
Generation: 1
Owner References:
API Version: serving.knative.dev/v1
Block Owner Deletion: true
Controller: true
Kind: Route
Name: earthwalker
UID: 262671d3-9e58-45e7-b769-13e2291324e8
Resource Version: 106686685
UID: 4e9b0b8f-873b-4e73-a65e-bf28d0e36eed
Spec:
Dns Names:
earthwalker.earthwalker.knative.my-domain-here.com
Secret Name: route-262671d3-9e58-45e7-b769-13e2291324e8
Events: <none>
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
/remove-lifecycle stale
Same here on version 1.7; did you ever solve this issue?
Unfortunately not, which is a pity, as knative looks like a good fit for my use case but is fairly useless if it doesn't work.
Assuming you followed all the steps here, right? And the cert-manager on your cluster is v1.0+?
I can confirm I have:
- knative-operator 1.7.0
- knative-serving 1.7.1
- kourier 1.7.0
- cert-manager 1.9.1
- knative configured with a custom domain, which has a wildcard A record set to knative's ingress IP
- cert-manager configured for HTTP-01 validations, and confirmed working with the haproxy ingress
I believe I may have located the source of the problem. Despite having configured the ClusterIssuer letsencrypt in the KnativeServing resource, the config-certmanager configmap still has an example configuration.
$ kubectl get knativeservings.operator.knative.dev -n knative-serving knative-serving -o yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
finalizers:
- knativeservings.operator.knative.dev
labels:
networking.knative.dev/certificate-provider: cert-manager
name: knative-serving
[...]
spec:
config:
certmanager:
issuerRef: |
kind: ClusterIssuer
name: letsencrypt
[...]
$ kubectl describe configmaps -n knative-serving config-certmanager
Name: config-certmanager
Namespace: knative-serving
Labels: app.kubernetes.io/component=net-certmanager
app.kubernetes.io/name=knative-serving
app.kubernetes.io/version=1.7.0
networking.knative.dev/certificate-provider=cert-manager
Annotations: <none>
Data
====
_example:
----
################################
# #
# EXAMPLE CONFIGURATION #
# #
################################
# This block is not actually functional configuration,
# but serves to illustrate the available configuration
# options and document them in a way that is accessible
# to users that `kubectl edit` this config map.
#
# These sample configuration options may be copied out of
# this block and unindented to actually change the configuration.
# issuerRef is a reference to the issuer for this certificate.
# IssuerRef should be either `ClusterIssuer` or `Issuer`.
# Please refer `IssuerRef` in https://github.com/cert-manager/cert-manager/tree/master/pkg/apis/certmanager/v1/types_certificate.go
# for more details about IssuerRef configuration.
issuerRef: |
kind: ClusterIssuer
name: letsencrypt-issuer
BinaryData
====
Events: <none>
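A correctly populated config-certmanager should carry issuerRef as a real data key rather than only inside the commented _example block. A minimal sketch of what the ConfigMap might look like (an assumption based on the _example block above, using the letsencrypt ClusterIssuer from the KnativeServing spec):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-certmanager
  namespace: knative-serving
  labels:
    networking.knative.dev/certificate-provider: cert-manager
data:
  # The issuerRef must be a top-level data key; entries left inside the
  # _example key are documentation only and are never read.
  issuerRef: |
    kind: ClusterIssuer
    name: letsencrypt
```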
Looks like this is being caused by a permissions issue. Is there a way to check the operator logs to see why policies aren't getting set up correctly?
W0909 21:59:29.092786 1 reflector.go:324] k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1.Secret: Unauthorized
E0909 21:59:29.092821 1 reflector.go:138] k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Secret: failed to list *v1.Secret: Unauthorized
W0909 21:59:33.456959 1 reflector.go:324] k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1.ConfigMap: Unauthorized
E0909 21:59:33.457001 1 reflector.go:138] k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Unauthorized
E0909 21:59:36.207204 1 leaderelection.go:330] error retrieving resource lock knative-serving/net-certmanager-webhook.configmapwebhook.00-of-01: Unauthorized
E0909 21:59:41.279851 1 leaderelection.go:330] error retrieving resource lock knative-serving/net-certmanager-webhook.webhookcertificates.00-of-01: Unauthorized
E0909 21:59:56.144964 1 leaderelection.go:330] error retrieving resource lock knative-serving/net-certmanager-webhook.configmapwebhook.00-of-01: Unauthorized
OK, so I found another issue that suggests the permissions need to be set up manually. I did that by creating a custom ClusterRoleBinding:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: knative-serving-certmanager
labels:
operator.knative.dev/release: "v1.7.0"
app.kubernetes.io/version: "1.7.0"
app.kubernetes.io/part-of: knative-operator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: knative-serving-certmanager
subjects:
- kind: ServiceAccount
name: controller
namespace: knative-serving
Having done that, things are definitely getting further, but certificates still aren't being generated correctly. I managed to find log entries from the kourier controller saying the solver services are missing, but no logs from anything else about trying or failing to create them.
{"severity":"WARNING","timestamp":"2022-09-10T11:57:58.637513758Z","logger":"net-kourier-controller","caller":"generator/ingress_translator.go:137","message":"Service 'earthwalker/cm-acme-http-solver-6422x' not yet created","commit":"09b107b-dirty","knative.dev/controller":"knative.dev.net-kourier.pkg.reconciler.ingress.Reconciler","knative.dev/kind":"networking.internal.knative.dev.Ingress","knative.dev/traceid":"51ab5d29-e64a-40a9-b11c-03a03ca0ddea","knative.dev/key":"earthwalker/earthwalker"}
How did you install net-certmanager? (Please note that cert-manager is different from Knative's net-certmanager.)
The operator does not have an option to install net-certmanager, so you need to install it manually with a command like the following, or via spec.additionalManifests as described in https://github.com/knative/operator/issues/950.
$ kubectl apply -f https://github.com/knative/net-certmanager/releases/download/knative-v1.7.0/release.yaml
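The spec.additionalManifests alternative mentioned above might look roughly like this (a sketch only; the release URL and version are taken from the thread):

```yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  # The operator applies these manifests alongside its own, so
  # net-certmanager is installed and upgraded with the rest of serving.
  additionalManifests:
    - URL: https://github.com/knative/net-certmanager/releases/download/knative-v1.7.0/release.yaml
```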
From a kustomization.yaml file:
bases:
- https://github.com/knative/operator/releases/download/knative-v1.7.0/operator.yaml
- https://github.com/knative/net-certmanager/releases/download/knative-v1.7.0/release.yaml
resources:
- roleBinding.yaml
I can see that both the net-certmanager-controller and net-certmanager-webhook deployments exist and are running 1 pod each.
Thank you. Hmm... I tested AutoTLS to see whether I could reproduce this, but it works for me without any issue such as a permission error. I am sharing the steps I followed below, so could you double-check whether there is any step you missed? I think you are doing it correctly, though.
1. Deploy the operator
export VERSION=knative-v1.7.0
kubectl apply -f https://github.com/knative/operator/releases/download/$VERSION/operator.yaml
kubectl wait deploy --all --for=condition=Available
2. Deploy knative serving
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
name: knative-serving
---
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
name: knative-serving
namespace: knative-serving
spec:
ingress:
kourier:
enabled: true
config:
network:
auto-tls: Enabled
http-protocol: Redirected
ingress-class: kourier.ingress.networking.knative.dev
EOF
kubectl wait deploy --all --for=condition=Available -n knative-serving
3. Deploy net-certmanager
kubectl apply --filename https://github.com/knative/net-certmanager/releases/download/$VERSION/release.yaml
4. Deploy cert-manager
export SERVING_REPO=${GOPATH}/src/knative.dev/serving
kubectl apply -f ${SERVING_REPO}/third_party/cert-manager-latest/
kubectl wait deploy --all --for=condition=Available -n knative-serving
kubectl wait deploy --all --for=condition=Available -n cert-manager
NOTE: ${GOPATH}/src/knative.dev/serving is this knative/serving repo.
5. Deploy the CA issuer
kubectl apply -f ${SERVING_REPO}/test/config/autotls/certmanager/caissuer/
EDIT: kubectl patch cm config-network -n "knative-serving" -p '{"data":{"autoTLS":"Enabled"}}' is not a good way to do this with the operator. I re-tested the configuration in the KnativeServing CR and updated the instructions.
6. Deploy a ksvc and verify AutoTLS
kn service create hello-example --image=gcr.io/knative-samples/helloworld-go
$ kubectl get ksvc
NAME URL LATESTCREATED LATESTREADY READY REASON
hello-example https://hello-example.default.example.com hello-example-00001 hello-example-00001 True
$ kubectl get kcert
NAME READY REASON
route-e2d9d6a1-8601-4d58-8952-62f6229d13f2 True
$ kubectl get cert
NAME READY SECRET AGE
route-e2d9d6a1-8601-4d58-8952-62f6229d13f2 True route-e2d9d6a1-8601-4d58-8952-62f6229d13f2 7m
I believe I may have located the source of the problem. Despite having configured the ClusterIssuer letsencrypt in the KnativeServing resource, the config-certmanager configmap still has an example configuration.
For the config-certmanager issue you mentioned above: the config-certmanager ConfigMap (included in net-certmanager) is not deployed by the operator, so you need to configure it directly rather than through the KnativeServing CR.
Having manually configured the config-certmanager ConfigMap, HTTP-01 verification still has the same issue, though I can now get certificates using DNS-01 verification.
I'm also facing another problem getting this to work on a user-friendly domain instead of the service.namespace.mydomain.tld one. Specifically, the DomainMapping remains unready with a "Waiting for load balancer to be ready" message. Trying to connect to the service in this state shows the following HTTP response, which I think I also saw at one point while trying to preview the HTTP-01 verification URL. Could this be related?
upstream connect error or disconnect/reset before headers. reset reason: connection failure
This issue or pull request is stale because it has been open for 90 days with no activity.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
/lifecycle stale
/remove-lifecycle rotten
/reopen
Just quickly hopping in to let you know I also had this issue. If you remove the http-protocol: Redirected setting from the configmap, cert-manager is able to issue the certificates. If I enable it, the 301 redirect breaks cert issuing.
So if you leave it at the default, your Services will listen on both 80 and 443, and you have to enforce the HTTPS redirect somewhere else.
Hope this helps. BR
Edit: As soon as the certs are ready you can add the redirect again, though this will again break cert-manager's certificate renewal mechanism.
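A sketch of this workaround in operator terms, assuming the KnativeServing CR from earlier in the thread: leave the redirect off while certificates are being issued or renewed.

```yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  config:
    network:
      auto-tls: Enabled
      # Keep plain HTTP enabled so the ACME HTTP-01 challenge is reachable;
      # a 301 redirect to HTTPS breaks the solver before the cert exists.
      http-protocol: Enabled
```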