consul-k8s

MeshGateway config required despite it being disabled

Open narendrapatel opened this issue 3 years ago • 6 comments

We have a VM-based Consul cluster and a Kubernetes-based Consul cluster in federation, with the VM cluster acting as the ACL primary and service mesh enabled. However, we observed that we still need to provide Mesh Gateway configuration even though it is disabled in the Helm chart config.

Application pods in the mesh do not get past the init stage. We get the following error in the pod logs:

2022-02-04T07:58:57.878Z [INFO]  Check to ensure a Kubernetes service has been created for this application. If your pod is not starting also check the connect-inject deployment logs.
2022-02-04T07:58:58.880Z [INFO]  Unable to find registered services; retrying

We get the following in the connect-inject logs:

{"level":"error","ts":1643961512.3430398,"logger":"controller.endpoints","msg":"failed to create service registrations for endpoints","name":"web","ns":"consul","error":"upstream \"api:8000:dc1\" is invalid: ProxyDefaults mesh gateway mode is neither \"local\" nor \"remote\"","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}
{"level":"error","ts":1643961512.343221,"logger":"controller.endpoints","msg":"failed to register services or health check","name":"web","ns":"consul","error":"upstream \"api:8000:dc1\" is invalid: ProxyDefaults mesh gateway mode is neither \"local\" nor \"remote\"","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}
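
For context, the upstream string in the error, `api:8000:dc1`, has the `service:port:datacenter` form used by the `consul.hashicorp.com/connect-service-upstreams` annotation. A minimal sketch of a pod template that would register such an upstream is below. It assumes the upstream is declared through that annotation; the Deployment itself, its labels, and the image name are placeholders, not the actual manifest from this cluster.

```yaml
# Hypothetical Deployment for the "web" service seen in the logs.
# Labels and image are placeholders chosen for illustration only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: consul
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
      annotations:
        # Ask connect-inject to add the sidecar to this pod.
        consul.hashicorp.com/connect-inject: "true"
        # service:port:datacenter, i.e. upstream "api" on local port 8000 in dc1.
        # The third field is what makes this a cross-datacenter upstream.
        consul.hashicorp.com/connect-service-upstreams: "api:8000:dc1"
    spec:
      containers:
        - name: web
          image: example/web:latest # placeholder image
```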

As a workaround, we need to set the MeshGateway mode in ProxyDefaults to local or remote (none does not work) for the pods to start up correctly. However, after startup Envoy does not receive any upstream cluster endpoints. We then have to set the MeshGateway mode in ServiceDefaults to none for each service so that its Envoy sidecar loads the upstream cluster endpoints.
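
The two-step workaround above could be expressed roughly as follows. This is a sketch that assumes the config entries are managed through the consul-k8s CRDs (`consul.hashicorp.com/v1alpha1`); the same settings could instead be written as config entries with the Consul CLI. The service name `web` is taken from the logs above, and `local` is just one of the two accepted modes.

```yaml
# Step 1: satisfy the connect-inject validation by setting a
# mesh gateway mode of "local" (or "remote") in ProxyDefaults.
# The ProxyDefaults resource must be named "global".
apiVersion: consul.hashicorp.com/v1alpha1
kind: ProxyDefaults
metadata:
  name: global
spec:
  meshGateway:
    mode: local
---
# Step 2: override the mode back to "none" per service so the
# sidecar connects to upstreams directly, without a gateway.
# Repeat for every service in the mesh ("web" is from the logs).
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceDefaults
metadata:
  name: web
spec:
  meshGateway:
    mode: none
```

The per-service override works because a ServiceDefaults setting takes precedence over the global ProxyDefaults value for that service.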

Consul Version

1.11.1

narendrapatel avatar Feb 07 '22 06:02 narendrapatel

Hi @narendrapatel, thank you for reaching out. I will try to replicate this and see where the disconnect is in Consul. Would you be so kind as to share the full Helm config you used? That will help me replicate the issue.

t-eckert avatar Feb 09 '22 16:02 t-eckert

Hi @t-eckert,

Thanks for the reply :)

Please find below the config used for Helm.

  global:
    image: "hashicorp/consul:1.11.1"
    datacenter: dc2
    federation:
      enabled: false
    imageEnvoy: "envoyproxy/envoy-alpine:v1.18.2"
  server:
    replicas: 1
    securityContext:
      runAsNonRoot: false
      runAsGroup: 0
      runAsUser: 0
      fsGroup: 0
    extraConfig: |
      {
        "primary_datacenter": "dc1",
        "retry_join_wan": ["10.29.149.94"]
      }
  meshGateway:
    enabled: false
  client:
    tolerations: |
      - key: "cloud.google.com/gke-preemptible"
        operator: Equal
        value: "true"
        effect: NoSchedule

Chart version used: 0.39.0

narendrapatel avatar Feb 11 '22 18:02 narendrapatel

This is because (a) we assume that Kubernetes clusters are all federated using mesh federation, and (b) we wanted to warn users that if they use an upstream in another DC and don't have a mesh gateway mode set, nothing will work.

Maybe a simple solution is a new value that can silence this error?

connectInject:
  validateRemoteDCUpstreams: true

Defaults to true but can be set to false?

lkysow avatar Feb 14 '22 19:02 lkysow

Hi @narendrapatel, hopefully we answered your question. I'll go ahead and close this issue; let us know if you have any follow-up on our previous response!

david-yu avatar Apr 20 '22 06:04 david-yu

Hi @david-yu, I think the above was a suggestion from @lkysow for something to be implemented in consul-k8s. I don't think this config option exists as of now. Thanks.

narendrapatel avatar May 17 '22 16:05 narendrapatel

OK, thanks. Let me re-open this for tracking.

david-yu avatar May 17 '22 16:05 david-yu