The kubernetes_logs source with in-cluster config doesn't work with intermediate CAs

Open NeonSludge opened this issue 4 years ago • 3 comments

Vector Version

vector 0.14.0 (x86_64-unknown-linux-gnu 5f3a319 2021-06-03)

Vector Configuration File

# Configuration for vector.
# Docs: https://vector.dev/docs/

data_dir = "/vector-data-dir"

[api]
  enabled = false
  address = "0.0.0.0:8686"
  playground = true

[log_schema]
  host_key = "host"
  message_key = "message"
  source_type_key = "source_type"
  timestamp_key = "timestamp"

# Ingest logs from Kubernetes.
[sources.kubernetes_logs]
  type = "kubernetes_logs"


# Emit internal Vector metrics.
[sources.internal_metrics]
  type = "internal_metrics"

# Expose metrics for scraping in the Prometheus format.
[sinks.prometheus_sink]
  address = "0.0.0.0:9090"
  inputs = ["internal_metrics"]
  type = "prometheus"


[sinks.stdout]
  encoding = "json"
  inputs = ["kubernetes_logs"]
  target = "stdout"
  type = "console"

Expected Behavior

Vector is able to successfully communicate with the Kubernetes API.

Actual Behavior

Vector logs certificate verification errors even with relevant CA certificates added to the system trust store inside its containers.

Example Data

Jul 16 08:19:36.982 ERROR source{component_kind="source" component_name=kubernetes_logs component_type=kubernetes_logs}: vector::internal_events::kubernetes::instrumenting_watcher: Watch invocation failed. error=Recoverable { source: Request { source: CallRequest { source: hyper::Error(Connect, ConnectError { error: Error { code: ErrorCode(1), cause: Some(Ssl(ErrorStack([Error { code: 337047686, library: "SSL routines", function: "tls_process_server_certificate", reason: "certificate verify failed", file: "ssl/statem/statem_clnt.c", line: 1915 }]))) }, verify_result: X509VerifyResult { code: 2, error: "unable to get issuer certificate" } }) } } } internal_log_rate_secs=5

Jul 16 08:19:36.982 WARN source{component_kind="source" component_name=kubernetes_logs component_type=kubernetes_logs}: vector::internal_events::kubernetes::reflector: Http Error in invocation! Your k8s metadata may be stale. Continuing Loop. error=Request { source: CallRequest { source: hyper::Error(Connect, ConnectError { error: Error { code: ErrorCode(1), cause: Some(Ssl(ErrorStack([Error { code: 337047686, library: "SSL routines", function: "tls_process_server_certificate", reason: "certificate verify failed", file: "ssl/statem/statem_clnt.c", line: 1915 }]))) }, verify_result: X509VerifyResult { code: 2, error: "unable to get issuer certificate" } }) } }

Additional Context

Our Kubernetes clusters use intermediate certificate authorities to issue all other certificates: /etc/kubernetes/pki/ca.crt, among others, is an intermediate CA. We're trying to deploy Vector as an agent via the official Helm chart. The kubernetes_logs source in Vector doesn't work even after we add the relevant certificates to the CA trust stores inside Vector's containers: it fails to communicate with kube-apiserver and logs "unable to get issuer certificate" errors. What's confusing is that openssl verify /var/run/secrets/kubernetes.io/serviceaccount/ca.crt returns OK inside the container, but Vector still logs certificate verification errors. We've also tried adding curl to the containers and issuing requests to the Kubernetes API (without specifying any additional options like --insecure or --cacert), and that works just fine (no TLS errors).

NeonSludge avatar Jul 16 '21 10:07 NeonSludge

So we've managed to get Vector working.

It turns out this has to do with how different Kubernetes client libraries deal with certificate chains. Most of the software running in our clusters is written in Go and uses the kubernetes/client-go client, which apparently treats the /var/run/secrets/kubernetes.io/serviceaccount/ca.crt CA as a trusted root certificate that the kube-apiserver certificate needs to chain up to. This means that verification works even when that CA is really an intermediate: we've been able to use Filebeat agents and other software that communicates with the k8s API without any issues.

It seems that Vector wants to build a chain up to a real trusted root instead, but ignores the system trust stores, which means that verification will never work if /var/run/secrets/kubernetes.io/serviceaccount/ca.crt does not contain a complete CA trust bundle.
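
To make the difference between the two behaviours concrete, here is a toy model of chain building. It is only a sketch of the logic as I understand it: no real cryptography is involved, and the certificate labels, the ISSUER_OF table and the verify function are all made up for illustration. Neither client-go nor Vector exposes anything by these names.

```python
# Toy model of certificate chain building; no real cryptography involved.
# All names here (ISSUER_OF, verify, the certificate labels) are hypothetical.

# Maps each certificate to its issuer; a self-signed root issues itself.
ISSUER_OF = {
    "kube-apiserver": "k8s-intermediate-ca",
    "k8s-intermediate-ca": "corporate-root-ca",
    "corporate-root-ca": "corporate-root-ca",
}


def verify(leaf, trusted, allow_partial_chain):
    """Walk from the leaf towards the root and say whether the chain is accepted.

    trusted: set of certificates handed to the verifier as its CA file.
    allow_partial_chain: if True, any trusted certificate may end the chain;
    if False, the chain must end at a trusted self-signed root.
    """
    current = leaf
    while True:
        issuer = ISSUER_OF[current]
        if issuer not in trusted:
            # Corresponds to "unable to get issuer certificate".
            return False
        issuer_is_root = ISSUER_OF[issuer] == issuer
        if issuer_is_root or allow_partial_chain:
            return True
        current = issuer


# ca.crt contains only the intermediate CA, as in our clusters.
only_intermediate = {"k8s-intermediate-ca"}
# ca.crt replaced by a bundle with the intermediate CA and the root.
full_bundle = {"k8s-intermediate-ca", "corporate-root-ca"}

print(verify("kube-apiserver", only_intermediate, allow_partial_chain=True))
print(verify("kube-apiserver", only_intermediate, allow_partial_chain=False))
print(verify("kube-apiserver", full_bundle, allow_partial_chain=False))
```

In this model the first call stands in for the client-go style of verification, the second for what we saw with Vector, and the third for Vector once ca.crt holds a complete bundle.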

We use kubeadm to bootstrap our clusters and it sets the --root-ca-file command line option for the kube-controller-manager processes to point to the main k8s CA certificate in /etc/kubernetes/pki/ca.crt by default. We've redirected this option to a proper CA bundle that contains the main k8s CA and our trusted root and now Vector just works.
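
For anyone wanting to do the same, a bundle like this is just the PEM files concatenated. Below is a minimal sketch, assuming both certificates are available as PEM files; the helper names build_bundle and count_certificates are made up for this example, and the exact file locations depend on your setup.

```python
from pathlib import Path

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"


def count_certificates(pem_text):
    """Return the number of PEM certificate blocks found in the text."""
    return pem_text.count(PEM_BEGIN)


def build_bundle(ca_paths, bundle_path):
    """Concatenate PEM files into one bundle and return its certificate count."""
    parts = [Path(p).read_text().strip() for p in ca_paths]
    bundle = "\n".join(parts) + "\n"
    Path(bundle_path).write_text(bundle)
    return count_certificates(bundle)


# Hypothetical usage (the root CA path and bundle path are assumptions):
# build_bundle(
#     ["/etc/kubernetes/pki/ca.crt", "/path/to/trusted-root.crt"],
#     "/path/to/ca-bundle.crt",
# )
```

count_certificates can also be run against the contents of /var/run/secrets/kubernetes.io/serviceaccount/ca.crt inside a pod as a quick check of whether it holds a single certificate or a full bundle.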

Not sure which would be the right way to deal with this: mention the --root-ca-file option in the docs, make Vector respect the system TLS trust store or both.

NeonSludge avatar Jul 19 '21 13:07 NeonSludge

> Not sure which would be the right way to deal with this: mention the --root-ca-file option in the docs, make Vector respect the system TLS trust store or both.

I didn't think we needed to take the system TLS store into account, since the in-cluster access interface explicitly provides the CA file, and I kind of assumed it would be the whole bundle. That is almost always the case from inside a container.

But this case makes sense, especially for a host-level deployment. Vector should be able to work with an intermediate cert and use the system trust store to build the full chain.

It is a rather odd setup for the container though.

MOZGIII avatar Sep 17 '21 06:09 MOZGIII

thanks @NeonSludge, I had exactly the same thing

pschulten avatar Jan 25 '24 14:01 pschulten