aws-app-mesh-examples

Please consider adding a walkthrough for cross-cluster mTLS support with unique trust_domains for each cluster

Open caleygoff-invitae opened this issue 4 years ago • 0 comments

The walkthroughs have been super helpful and very illuminating so far. It would also be helpful to include a walkthrough that explains how to set up a cross-cluster configuration with SPIRE/SPIFFE serving as the mTLS SVID authority, using a unique trust_domain for each cluster.

I do see related issues on Envoy (here and here) and Istio (here). There is SPIRE/SPIFFE documentation about trust domains (here) and their bundles (here), and, I think, about retrieving those bundles (here).

Interestingly, I’m attempting to use a unique trust_domain in the SPIRE/SPIFFE configuration for each cluster, namely the FQDN of that cluster. Below is a snippet of log output from the Envoy container on the VirtualNode, followed by that VirtualNode's configuration. Note the two different trust domains, demo1.dev.somecorp.net and demo2.dev.somecorp.net: the front app is on demo1 and the color apps are on demo2. I do not see a clear way to trust the domain from my remote cluster.

[2021-04-06 14:43:56.319][102][debug][client] [source/common/http/codec_client.cc:96] [C196] disconnect. resetting 0 pending requests
[2021-04-06 14:43:56.319][102][debug][pool] [source/common/conn_pool/conn_pool_base.cc:314] [C196] client disconnected, failure reason: TLS error: Secret is not supplied by SDS
[2021-04-06 14:43:56.319][102][debug][router] [source/common/router/router.cc:1031] [C195][S15452765140832383970] upstream reset: reset reason: local reset, transport failure reason: TLS error: Secret is not supplied by SDS
[2021-04-06 14:43:56.323][102][debug][router] [source/common/router/router.cc:1533] [C195][S15452765140832383970] performing retry
[2021-04-06 14:43:56.323][102][debug][pool] [source/common/http/conn_pool_base.cc:71] queueing stream due to no available connections
[2021-04-06 14:43:56.323][102][debug][pool] [source/common/conn_pool/conn_pool_base.cc:104] creating a new connection
[2021-04-06 14:43:56.323][102][debug][config] [source/extensions/transport_sockets/tls/ssl_socket.cc:348] Create NotReadySslSocket
[2021-04-06 14:43:56.323][102][debug][client] [source/common/http/codec_client.cc:39] [C197] connecting
---
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualNode
metadata:
  name: front
  namespace: cross-cluster-test
spec:
  podSelector:
    matchLabels:
      app: front
  listeners:
    - portMapping:
        port: 8080
        protocol: http
      healthCheck:
        protocol: http
        path: '/ping'
        healthyThreshold: 2
        unhealthyThreshold: 2
        timeoutMillis: 2000
        intervalMillis: 5000
  backends:
    - virtualService:
        virtualServiceARN: arn:aws:appmesh:us-east-1:XXXXXXXXXXXX:mesh/dev/virtualService/color.cross-cluster-test.svc.cluster.local
  backendDefaults:
    clientPolicy:
      tls:
        mode: STRICT
        certificate:
          sds:
            secretName: spiffe://demo1.dev.somecorp.net/cross-cluster-test/front
        validation:
          trust:
            sds:
              secretName: spiffe://demo1.dev.somecorp.net
          subjectAlternativeNames:
            match:
              exact:
                - spiffe://demo2.dev.somecorp.net/cross-cluster-test/blue
                - spiffe://demo2.dev.somecorp.net/cross-cluster-test/red
                - spiffe://demo2.dev.somecorp.net/cross-cluster-test/green
  serviceDiscovery:
    awsCloudMap:
      namespaceName: mesh.dev.somecorp.net
      serviceName: front-demo1
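
To make the trust-domain split in the configuration above concrete, here is a minimal sketch that extracts the trust domain from each SPIFFE ID and lists the expected SANs that fall outside the trust bundle's domain. The helper names (`trust_domain`, `sans_outside_bundle`) are hypothetical and only illustrate the mismatch. This does not show how App Mesh, Envoy, or SPIRE validate peers, and I am not certain this mismatch is what causes the SDS error above.

```python
from urllib.parse import urlsplit


def trust_domain(spiffe_id: str) -> str:
    """Return the trust domain (the authority part) of a SPIFFE ID."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    return parts.netloc


def sans_outside_bundle(bundle_id: str, sans: list[str]) -> list[str]:
    """List the SANs whose trust domain differs from the bundle's trust domain."""
    bundle_domain = trust_domain(bundle_id)
    return [san for san in sans if trust_domain(san) != bundle_domain]


# Values copied from the VirtualNode spec above.
TRUST_BUNDLE = "spiffe://demo1.dev.somecorp.net"
EXPECTED_SANS = [
    "spiffe://demo2.dev.somecorp.net/cross-cluster-test/blue",
    "spiffe://demo2.dev.somecorp.net/cross-cluster-test/red",
    "spiffe://demo2.dev.somecorp.net/cross-cluster-test/green",
]

if __name__ == "__main__":
    for san in sans_outside_bundle(TRUST_BUNDLE, EXPECTED_SANS):
        print(f"{san} is not in trust domain {trust_domain(TRUST_BUNDLE)}")
```

With the values from my spec, every expected SAN lives in demo2 while the validation trust bundle refers to demo1, which is exactly the cross-cluster trust I do not know how to express.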

Any help with this would be appreciated.

caleygoff-invitae • Apr 12 '21 17:04