Issue with Let's Encrypt and Host CRD

Open davor2klin opened this issue 3 years ago • 3 comments

I have AWS EKS behind an AWS load balancer. Let's Encrypt doesn't work at all with an AWS NLB. With an AWS Classic Load Balancer, I am able to register ONLY one host; after that, every other host fails with the error ACME 403 unauthorized. I get the same error when I use an AWS NLB. Error:

obtaining tlsSecret "test1.mydomain.com"."ambassador"
    (hostnames=["test1.mydomain.com"]): acme: Error -> One or more domains had
    a problem:

    [test1.mydomain.com] acme: error: 403 ::
    urn:ietf:params:acme:error:unauthorized :: Invalid response from
    http://test1.mydomain.com/.well-known/acme-challenge/NM0XccervQ1Ldjm-50dsdf2F5qrZ2fdfsXqjyiuvium0V-tI

 authority: https://acme-v02.api.letsencrypt.org/directory

The single validated host (test.mydomain.com), with the AWS Classic Load Balancer, is reachable and doesn't have any other issues.

Setup:

apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
  name: test
  namespace: ambassador  
spec:
  hostname: "test.mydomain.com"
  acmeProvider:
    email: [email protected]
    authority: https://acme-v02.api.letsencrypt.org/directory
  requestPolicy:
    insecure:
      action: Redirect
      additionalPort: 8080
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: test
  namespace: ambassador
spec:
  host: "test.mydomain.com"
  prefix: "/"
  service: "nginx.default:80" 
---
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
  name: test1
  namespace: ambassador  
spec:
  hostname: "test1.mydomain.com"
  acmeProvider:
    email: [email protected]
    authority: https://acme-v02.api.letsencrypt.org/directory
  requestPolicy:
    insecure:
      action: Redirect
      additionalPort: 8080
---
apiVersion: getambassador.io/v3alpha1
kind: Mapping
metadata:
  name: test1
  namespace: ambassador
spec:
  host: "test1.mydomain.com"
  prefix: "/"
  service: "nginx1.default:80" 

EKS 1.21 (newly created; Edge Stack is the first resource)

Edge Stack 2.0.5

davor2klin avatar Feb 06 '22 22:02 davor2klin

I have the same issue with GCP. Even the first host fails sometimes.

MatTerra avatar Feb 10 '22 13:02 MatTerra

Hi there,

This is a known issue that will be solved in a future release. Currently, when there are no Host objects present, Edge Stack uses a synthetic Host with a self-signed certificate so that it can respond to requests. After you create your first Host, this synthetic Host goes away. Without a wildcard Host to respond to the ACME challenge, every Host that uses ACME will fail after the first one is created.

Unfortunately, this information was not very visible in our documentation. We've now added information about this issue and its workaround to all of the ACME documents, where it will stay until the issue is resolved in a release.

You can follow this document to quickly get started with your own wildcard host and self-signed certificate.
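
As a rough sketch of what that workaround tends to look like (not a copy of the linked document): a TLS Secret holding a self-signed certificate, plus a wildcard Host that references it and turns ACME off for itself. The names `tls-cert` and `wildcard-host` and the `ambassador` namespace are assumptions chosen for illustration, and the certificate and key values are placeholders you would fill in yourself.

```yaml
# Sketch only: resource names and namespace are assumptions, not required values.
apiVersion: v1
kind: Secret
metadata:
  name: tls-cert
  namespace: ambassador
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded self-signed certificate>  # placeholder
  tls.key: <base64-encoded private key>              # placeholder
---
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
  name: wildcard-host
  namespace: ambassador
spec:
  hostname: "*"          # matches any hostname, so it can answer requests for Hosts that have no certificate yet
  acmeProvider:
    authority: none      # this Host does not request a certificate via ACME
  tlsSecret:
    name: tls-cert       # the self-signed Secret defined above
```

The intent is that the wildcard Host takes over the role the synthetic Host played before the first real Host was created.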

Thanks for bringing this to our attention and please let us know if you have any other issues getting started with this.

Alice-Lilith avatar Feb 10 '22 21:02 Alice-Lilith

Thank you for your help @AliceProxy. However, I still can't use ACME cert management. I applied a Listener and a wildcard Host, and then tried to create the Host with ACME. The error changed, but the request still didn't complete. Below are the configuration files I used. tls-cert is a Secret created as demonstrated in the link you sent.

apiVersion: getambassador.io/v3alpha1
kind: Listener
metadata:
  name: edge-stack-listener-8443
  namespace: ambassador
spec:
  port: 8443
  protocol: HTTPS
  securityModel: XFP
  hostBinding:
    namespace:
      from: ALL
---
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
  name: wildcard-host
spec:
  hostname: "*"
  acmeProvider:
    authority: none
  tlsSecret:
    name: tls-cert
---
apiVersion: getambassador.io/v3alpha1
kind: Host
metadata:
  name: grafana-host
spec:
  hostname: "mylink.example" # Just replaced to post
  acmeProvider:
    email: "[email protected]"

These are the events from kubectl describe host grafana-host:

Events:
  Type     Reason   Age                From                   Message
  ----     ------   ----               ----                   -------
  Normal   Pending  29s                Ambassador Edge Stack  waiting for Host DefaultsFilled change to be reflected in snapshot
  Normal   Pending  29s                Ambassador Edge Stack  creating private key Secret
  Normal   Pending  29s                Ambassador Edge Stack  waiting for private key Secret creation to be reflected in snapshot
  Normal   Pending  28s                Ambassador Edge Stack  waiting for Host status change to be reflected in snapshot
  Normal   Pending  28s                Ambassador Edge Stack  registering ACME account
  Normal   Pending  28s                Ambassador Edge Stack  ACME account registered
  Normal   Pending  28s                Ambassador Edge Stack  waiting for Host ACME account registration change to be reflected in snapshot
  Normal   Pending  20s (x2 over 28s)  Ambassador Edge Stack  tlsSecret "mylink.example"."ambassador" (hostnames=["mylink.example"]): needs updated: tlsSecret does not exist
  Normal   Pending  20s (x2 over 28s)  Ambassador Edge Stack  performing ACME challenge for tlsSecret "mylink.example"."ambassador" (hostnames=["mylink.example"])...
  Warning  Error    14s                Ambassador Edge Stack  obtaining tlsSecret "mylink.example"."ambassador" (hostnames=["mylink.example"]): error: one or more domains had a problem:
[mylink.example] context canceled

And these are logs in the edge-stack pod:

2022-02-11 14:28:08 diagd 2.2.0 [P24TAEW] ERROR: Secret mylink.example.ambassador unknown
2022-02-11 14:28:08 diagd 2.2.0 [P24TAEW] ERROR: Host grafana-host: invalid TLS secret mylink.example, marking inactive
2022-02-11 14:28:08 diagd 2.2.0 [P24TAEW] INFO: EnvoyConfig: Generating V3
2022-02-11 14:28:08 diagd 2.2.0 [P24TAEW] INFO: V3Listener: ==== GENERATED <V3Listener HTTP edge-stack-listener-8443 on 0.0.0.0:8443 [XFP]>
2022/02/11 14:28:08 [INFO] [mylink.example] acme: Obtaining bundled SAN certificate
time="2022-02-11 14:28:08.2856" level=info msg="Loaded file /ambassador/envoy/envoy.json" func=github.com/datawire/ambassador/v2/cmd/ambex.Decode file="/go/cmd/ambex/main.go:279" CMD=entrypoint PID=1 THREAD=/ambex
time="2022-02-11 14:28:08.2874" level=info msg="Saved snapshot v175" func=github.com/datawire/ambassador/v2/cmd/ambex.csDump file="/go/cmd/ambex/main.go:369" CMD=entrypoint PID=1 THREAD=/ambex
time="2022-02-11 14:28:08.2889" level=info msg="Pushing snapshot v175" func=github.com/datawire/ambassador/v2/cmd/ambex.updaterWithTicker file="/go/cmd/ambex/ratelimit.go:159" CMD=entrypoint PID=1 THREAD=/ambex
2022-02-11 14:28:08 diagd 2.2.0 [P24TAEW] INFO: configuration updated (incremental) from snapshot snapshot (S19 L1 G7 C3)
2022-02-11 14:28:08 diagd 2.2.0 [P24TThreadPoolExecutor-0_1] INFO: F5CD5402-A8F8-471B-B9F8-C5133B312E5D: 127.0.0.1 "GET /ambassador/v0/diag/" 20ms 200 success
time="2022-02-11 14:28:08.3290" level=warning msg="search is nil, not indexing" func="github.com/datawire/apro/v2/cmd/amb-sidecar/devportal/server.(*Server).IndexOpenAPIDocs" file="github.com/datawire/apro/v2/cmd/amb-sidecar/devportal/server/server.go:107" CMD=amb-sidecar PID=14 THREAD=/devportal_fetcher subsystem=fetcher
2022/02/11 14:28:08 [INFO] [mylink.example] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/77483131510
2022/02/11 14:28:08 [INFO] [mylink.example] acme: Could not find solver for: tls-alpn-01
2022/02/11 14:28:08 [INFO] [mylink.example] acme: use http-01 solver
2022/02/11 14:28:08 [INFO] [mylink.example] acme: Trying to solve HTTP-01
2022-02-11 14:28:12 diagd 2.2.0 [P24TThreadPoolExecutor-0_0] INFO: EB1F7280-A1FB-418B-BDA7-79B7846BAFFC: 127.0.0.1 "GET /ambassador/v0/diag/" 23ms 200 success
time="2022-02-11 14:28:12.2241" level=warning msg="search is nil, not indexing" func="github.com/datawire/apro/v2/cmd/amb-sidecar/devportal/server.(*Server).IndexOpenAPIDocs" file="github.com/datawire/apro/v2/cmd/amb-sidecar/devportal/server/server.go:107" CMD=amb-sidecar PID=14 THREAD=/devportal_fetcher subsystem=fetcher
2022/02/11 14:28:13 [INFO] Deactivating auth: https://acme-v02.api.letsencrypt.org/acme/authz-v3/77483131510
time="2022-02-11 14:28:13.7203" level=error msg="update \"grafana-host\".\"ambassador\": Operation cannot be fulfilled on hosts.getambassador.io \"grafana-host\": the object has been modified; please apply your changes to the latest version and try again" func="github.com/datawire/apro/v2/cmd/amb-sidecar/acmeclient.(*Controller).recordHostError" file="github.com/datawire/apro/v2/cmd/amb-sidecar/acmeclient/k8s_controller.go:424" CMD=amb-sidecar PID=14 THREAD=/acme_client host=grafana-host.ambassador namespace=ambassador secret=mylink.example
2022/02/11 14:28:13 [INFO] [mylink.example] acme: Obtaining bundled SAN certificate
2022/02/11 14:28:14 [INFO] [mylink.example] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/77483154950
2022/02/11 14:28:14 [INFO] [mylink.example] acme: Could not find solver for: tls-alpn-01
2022/02/11 14:28:14 [INFO] [mylink.example] acme: use http-01 solver
2022/02/11 14:28:14 [INFO] [mylink.example] acme: Trying to solve HTTP-01
2022/02/11 14:28:18 [INFO] Deactivating auth: https://acme-v02.api.letsencrypt.org/acme/authz-v3/77483154950
time="2022-02-11 14:28:18.2480" level=error msg="update \"grafana-host\".\"ambassador\": Operation cannot be fulfilled on hosts.getambassador.io \"grafana-host\": the object has been modified; please apply your changes to the latest version and try again" func="github.com/datawire/apro/v2/cmd/amb-sidecar/acmeclient.(*Controller).recordHostError" file="github.com/datawire/apro/v2/cmd/amb-sidecar/acmeclient/k8s_controller.go:424" CMD=amb-sidecar PID=14 THREAD=/acme_client host=grafana-host.ambassador namespace=ambassador secret=mylink.example
2022/02/11 14:28:18 [INFO] [mylink.example] acme: Obtaining bundled SAN certificate
2022/02/11 14:28:18 [INFO] [mylink.example] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/77483172590
2022/02/11 14:28:18 [INFO] [mylink.example] acme: Could not find solver for: tls-alpn-01
2022/02/11 14:28:18 [INFO] [mylink.example] acme: use http-01 solver
2022/02/11 14:28:18 [INFO] [mylink.example] acme: Trying to solve HTTP-01
2022/02/11 14:28:23 [INFO] Deactivating auth: https://acme-v02.api.letsencrypt.org/acme/authz-v3/77483172590
2022-02-11 14:28:23 diagd 2.2.0 [P24TAEW] ERROR: Secret mylink.example.ambassador unknown
2022-02-11 14:28:23 diagd 2.2.0 [P24TAEW] ERROR: Host grafana-host: invalid TLS secret mylink.example, marking inactive
2022-02-11 14:28:23 diagd 2.2.0 [P24TAEW] INFO: EnvoyConfig: Generating V3
2022-02-11 14:28:23 diagd 2.2.0 [P24TAEW] INFO: V3Listener: ==== GENERATED <V3Listener HTTP edge-stack-listener-8443 on 0.0.0.0:8443 [XFP]>
2022-02-11 14:28:24 diagd 2.2.0 [P24TAEW] INFO: configuration updated (incremental) from snapshot snapshot (S19 L1 G7 C3)
time="2022-02-11 14:28:24.0298" level=info msg="Loaded file /ambassador/envoy/envoy.json" func=github.com/datawire/ambassador/v2/cmd/ambex.Decode file="/go/cmd/ambex/main.go:279" CMD=entrypoint PID=1 THREAD=/ambex
time="2022-02-11 14:28:24.0319" level=info msg="Saved snapshot v176" func=github.com/datawire/ambassador/v2/cmd/ambex.csDump file="/go/cmd/ambex/main.go:369" CMD=entrypoint PID=1 THREAD=/ambex
time="2022-02-11 14:28:24.0338" level=info msg="Pushing snapshot v176" func=github.com/datawire/ambassador/v2/cmd/ambex.updaterWithTicker file="/go/cmd/ambex/ratelimit.go:159" CMD=entrypoint PID=1 THREAD=/ambex
2022-02-11 14:28:24 diagd 2.2.0 [P24TThreadPoolExecutor-0_0] INFO: B59A9BBF-56C5-4D31-AC8A-B1B5B8C62C27: 127.0.0.1 "GET /ambassador/v0/diag/" 62ms 200 success
time="2022-02-11 14:28:24.1128" level=warning msg="search is nil, not indexing" func="github.com/datawire/apro/v2/cmd/amb-sidecar/devportal/server.(*Server).IndexOpenAPIDocs" file="github.com/datawire/apro/v2/cmd/amb-sidecar/devportal/server/server.go:107" CMD=amb-sidecar PID=14 THREAD=/devportal_fetcher subsystem=fetcher

MatTerra avatar Feb 11 '22 14:02 MatTerra