Add support for TCP+TLS health checks

Open jameshartig opened this issue 1 year ago • 3 comments

TCP+TLS health checks were added in https://github.com/hashicorp/consul/pull/18381 but from what I can tell they're not supported in consul-esm.

jameshartig avatar Jan 11 '24 17:01 jameshartig

It is working for me, so could you please expand on the error you are seeing?

With tcp alone:

With the definition below, consul-esm was able to detect the service going down.

{
  "Node": "venus-dc-ext-count-tls-node",
  "Address": "172.31.26.18",
  "Token": "3d5f4ccd-c076-92b1-c88e-2abe70493e2a",
  "NodeMeta": {
    "external-node": "true",
    "external-probe": "true"
  },
  "Service": {
    "ID": "venus-dc-count-tls",
    "Service": "venus-dc-ext-count-tls",
    "Port": 10017
  },
  "Checks": [
    {
      "Name": "venus-dc-ext-count-tls-check",
      "Status": "passing",
      "Definition": {
        "Name": "venus-dc-ext-count-tls TCP check on port 172.31.26.18:10017",
        "TCP": "172.31.26.18:10017",
        "Interval": "10s",
        "Timeout": "1s"
      }
    }
  ]
}

consul-esm logged the following:

2024-01-17T06:55:48.247Z [WARN]  consul-esm: Check is now critical: check=venus-dc-ext-count-node/venus-dc-ext-count-check
2024-01-17T06:55:52.583Z [WARN]  consul-esm: Check socket connection failed: check=venus-dc-ext-count-tls-node/venus-dc-ext-count-tls-check error="dial tcp 172.31.26.18:10017: connect: connection refused"

With a TLS health check:

With the external service definition below:

(venv) root@ip-172-31-18-50:~# cat tgw-app-count-tls.json
{
  "Node": "venus-dc-ext-count-tls-node",
  "Address": "172.31.26.18",
  "Token": "885cb598-b105-e554-f7f3-ed084d760f32",
  "NodeMeta": {
    "external-node": "true",
    "external-probe": "true"
  },
  "Service": {
    "ID": "venus-dc-count-tls",
    "Service": "venus-dc-ext-count-tls",
    "Port": 10017
  },
  "Checks": [
    {
      "Name": "venus-dc-ext-count-tls-check",
      "Status": "passing",
      "Definition": {
        "Name": "venus-dc-ext-count-tls TCP check on port 172.31.26.18:10017",
        "HTTP": "https://172.31.26.18:10017/health",
        "Interval": "10s",
        "Timeout": "1s"
      }
    }
  ]
}

(venv) root@ip-172-31-18-50:~#

And a consul-esm config file and startup command as below:

(venv) root@ip-172-31-26-18:~# cat $PWD/consul-esm-config.hcl
https_ca_file = "/opt/consul/custom-apps/tgw/certs/venus-srv.com.crt"
(venv) root@ip-172-31-26-18:~# consul-esm -config-file $PWD/consul-esm-config.hcl

consul-esm was able to run my health checks. Below is the check output as shown in the UI at https://<host>:8501/ui/venus-dc/services/venus-dc-ext-count-tls/instances/venus-dc-ext-count-tls-node/venus-dc-count-tls/health-checks:

Output:
HTTP GET https://172.31.26.18:10017/health: 200 OK Output: {"hostname":"ip-172-31-26-18","inside_function":"/opt/consul/custom-apps/tgw/tgw-count-tls.py['health']","response":"healthy"}

vyanamandra avatar Jan 17 '24 07:01 vyanamandra

@vyanamandra your example used an HTTP health check, not a TCP one. I'm talking about a TCP health check with TCPUseTLS set to true. Please see the consul PR linked in the issue description.

jameshartig avatar Jan 17 '24 08:01 jameshartig

@jameshartig I had no idea this project existed when I added TCP+TLS to consul itself. Sorry about that. I don't believe it's been plumbed through nomad either.

pgporada avatar Sep 05 '24 03:09 pgporada