
PKIX path building failed: OpenMetadata 1.9.1 Ignores All SSL Trust Configuration for Elasticsearch on Kubernetes

Open data-tomic opened this issue 4 months ago • 3 comments

Affected module

Backend

Describe the bug

OpenMetadata Server v1.9.1, when deployed on Kubernetes, is unable to establish a trusted SSL connection to an external Elasticsearch cluster (v8.5.1) that uses a custom Certificate Authority (CA). The application consistently fails with javax.net.ssl.SSLHandshakeException: PKIX path building failed, indicating that it is not using the provided trust material.

This issue persists despite applying all standard and community-recommended SSL configuration methods. The application appears to completely ignore any provided custom CA trust configuration, affecting both the run-db-migrations initContainer and the main openmetadata container's search client.

The infrastructure itself is confirmed to be working correctly, as Kibana and direct curl requests from within the cluster can connect to Elasticsearch successfully using the same custom CA.

To Reproduce

  1. Environment Setup:

    • Deploy Elasticsearch 8.5.1 on Kubernetes.
    • Enable TLS for the Elasticsearch HTTP interface using a custom Certificate Authority (e.g., an internal corporate CA).
    • Deploy OpenMetadata 1.9.1 using Kubernetes manifests (not the Helm chart directly, but manifests generated and adapted from it).
  2. Configuration:

    • Configure OpenMetadata to connect to the Elasticsearch service via https://<service-name>:9200.
    • Ensure the custom root CA certificate is available as a Kubernetes secret.
  3. Attempt SSL Trust Configuration (Any of the following methods):

    • Method A (JVM System Properties): Use a Kustomize patch to add an initContainer that creates a truststore.jks from the CA secret. Then, set the OPENMETADATA_OPTS environment variable for both the run-db-migrations initContainer and the main openmetadata container to -Djavax.net.ssl.trustStore=/path/to/truststore.jks -Djavax.net.ssl.trustStorePassword=....
    • Method B (Application Environment Variables): Set various environment variables in the openmetadata-connection-details secret, such as ELASTICSEARCH_SSL_VERIFY: "true" with ELASTICSEARCH_SSL_CERT_PATH: /path/to/ca.crt or ELASTICSEARCH_TRUSTSTORE_PATH.
    • Method C (Default cacerts Replacement - Community Recommended): Use an initContainer to copy the default JVM cacerts file, import the custom CA into it, and then mount this modified cacerts file over the original file (/usr/lib/jvm/java-21-openjdk/lib/security/cacerts) in both the run-db-migrations initContainer and the main openmetadata container.
  4. Observe Behavior:

    • The OpenMetadata pod starts and passes basic Kubernetes liveness/readiness probes. The UI is accessible.
    • However, any action that triggers a search query (e.g., loading the homepage, using the search bar, viewing the Health Check page) fails.
    • The logs for both the run-db-migrations container (if it gets to that point) and the main openmetadata container show the PKIX path building failed error.
    • The Health Check page shows "Search Instance" as failed with the same SSL error.

Expected behavior

The OpenMetadata server should respect at least one of the provided SSL trust configurations (ideally, the cacerts replacement method, as recommended in the community Slack for similar issues) and successfully establish a secure connection to Elasticsearch, allowing all features to work correctly.

Version:

  • OS: Ubuntu 22.04 (running MicroK8s)
  • Python version: N/A (Backend issue)
  • OpenMetadata version: 1.9.1
  • OpenMetadata Ingestion package version: 1.9.1
  • Kubernetes version: MicroK8s v1.32.3 (Kubernetes Server v1.32.3)

Additional context

1. Infrastructure is proven to be correctly configured:

  • Kibana, running in the same cluster, connects to the same Elasticsearch instance over HTTPS without any issues, using the same custom CA certificate.
  • curl -u "user:pass" https://elasticsearch-master.efk-stack.svc.cluster.local:9200 --cacert ca.pem from an ephemeral container within the cluster works perfectly. This definitively isolates the issue to the OpenMetadata Java application.

2. The cacerts replacement method is a community-sanctioned solution:

  • As seen in the OpenMetadata Slack, modifying the default cacerts file is the recommended approach for fixing PKIX errors when connecting to other services like Airflow. This strongly suggests that the Elasticsearch client within OpenMetadata is not respecting this fundamental JVM trust mechanism.

3. Successful Workaround:

  • The only way to make the system work is to switch the entire internal communication stack to HTTP, by setting ELASTICSEARCH_SCHEME: "http" in OpenMetadata and disabling TLS on the Elasticsearch service. This confirms the problem is purely related to SSL verification.
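As a complement to the curl check in item 1, it can help to look at the certificate chain that Elasticsearch actually presents, since any truststore has to contain a CA that anchors that chain. The following is a minimal sketch and not part of the original report: the `show_chain` helper and the `DRY_RUN` switch are hypothetical names introduced for illustration, and the host and CA file are whatever was used for the curl test.

```shell
#!/usr/bin/env bash
# Sketch: print the certificate chain served on the Elasticsearch HTTPS port.
# show_chain and DRY_RUN are illustrative names, not part of any tool.

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# show_chain HOST PORT CA_FILE
# -showcerts prints every certificate the server sends; the verification
# result near the end of the output shows whether the chain validates
# against CA_FILE.
show_chain() {
  run openssl s_client -connect "$1:$2" -servername "$1" \
    -CAfile "$3" -showcerts </dev/null
}
```

For example, `show_chain elasticsearch-master.efk-stack.svc.cluster.local 9200 ca.pem` would show whether the server sends only a leaf certificate or also intermediates, which determines which CA certificates a truststore needs to hold.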

Log Snippet:

```
es.org.elasticsearch.ElasticsearchException: java.util.concurrent.ExecutionException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
...
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
...
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:148)
...
```

data-tomic avatar Aug 26 '25 08:08 data-tomic

hi @data-tomic

I want to clarify what’s happening with your setup.

When OpenMetadata tries to connect to your Elasticsearch over HTTPS, it sees a certificate signed by your internal CA. Since this isn’t from a public authority, OpenMetadata doesn’t recognize it and blocks the connection.

How OpenMetadata Works Today

  • If you provide a truststore (JKS) → OpenMetadata will use it to validate certificates.

  • If you don’t provide a truststore → it doesn’t fall back to Java’s default cacerts or JVM SSL options. It just fails with the PKIX error.

  • That’s why adding your CA to cacerts or setting JVM flags hasn’t worked — the app never looks there.

What You Need To Do

OpenMetadata needs you to explicitly tell it which certificates to trust:

  1. Export your Elasticsearch CA certificate.

  2. Convert it into a JKS truststore using keytool.

  3. Mount that JKS file into the OpenMetadata pod (both the migration initContainer and the main app).

  4. Point OpenMetadata to it with these env vars:

ELASTICSEARCH_TRUST_STORE_PATH

ELASTICSEARCH_TRUST_STORE_PASSWORD
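Steps 2 and 3 above might be sketched roughly as follows. This is an illustration under assumptions, not an official procedure: the helper functions, the `DRY_RUN` switch, and every alias, file, secret, and namespace name are placeholders chosen by the caller. Only `keytool -importcert` and `kubectl create secret generic` are existing commands.

```shell
#!/usr/bin/env bash
# Sketch of steps 2-3: build a JKS truststore from a CA certificate and
# publish it as a Kubernetes secret. Helper names and DRY_RUN are illustrative.

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# build_truststore ALIAS CA_FILE JKS_FILE STORE_PASSWORD
# Adds CA_FILE as a trusted-certificate entry in JKS_FILE, creating the
# keystore if it does not exist yet.
build_truststore() {
  run keytool -importcert -noprompt -trustcacerts \
    -alias "$1" -file "$2" \
    -keystore "$3" -storetype JKS -storepass "$4"
}

# create_truststore_secret SECRET_NAME NAMESPACE JKS_FILE
# The secret key (truststore.jks here) becomes the file name inside the
# volume when the secret is mounted into a pod.
create_truststore_secret() {
  run kubectl create secret generic "$1" --namespace "$2" \
    --from-file="truststore.jks=$3"
}
```

A call could look like `build_truststore es-ca ca.crt truststore.jks <password>` followed by `create_truststore_secret es-truststore <namespace> truststore.jks`. `ELASTICSEARCH_TRUST_STORE_PATH` would then need to be the mount path plus `truststore.jks`, in both the migration initContainer and the main container.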

After these steps, the PKIX path building failed error should no longer appear in the logs, and search in the UI should work normally.

This is expected behavior - OpenMetadata is designed to only trust what you explicitly configure, so a JKS truststore is required when using a custom CA.

sonika-shah avatar Sep 29 '25 10:09 sonika-shah

Hi @sonika-shah, thanks for the reply. I tried what you suggested.

Following the values.yaml suggested by OpenMetadata, I did the following:

  1. Converted my ca.crt file to JKS using keytool (done locally, then exported to the pod):

keytool -import -trustcacerts -alias elkstore -file elk-ca-bundle.crt -keystore elk-truststore.jks -storepass changeit

  2. Added the environment variables:

extraEnvs:
  - name: ELASTICSEARCH_TRUST_STORE_PATH
    value: /opt/openmetadata/truststore/elk-truststore.jks
  - name: ELASTICSEARCH_TRUST_STORE_PASSWORD
    value: changeit

  3. Mounted the JKS into the pod:

extraVolumes:
  - name: elasticsearch-truststore
    secret:
      secretName: elasticsearch-truststore-secrets

extraVolumeMounts:
  - name: elasticsearch-truststore
    mountPath: /opt/openmetadata/truststore/
    readOnly: true

  4. Used it in the elasticsearch section:

elasticsearch:
  enabled: true
  host:
  searchType: elasticsearch
  port: 9200
  scheme: https
  clusterAlias: "test-openmetadata"
  # Value in Bytes
  payLoadSize: 10485760
  connectionTimeoutSecs: 5
  socketTimeoutSecs: 60
  batchSize: 100
  searchIndexMappingLanguage: "EN"
  keepAliveTimeoutSecs: 600
  trustStore:
    enabled: true
    path: "/opt/openmetadata/truststore/elk-truststore.jks"
    password:
      secretRef: elasticsearch-truststore-secrets
      secretKey: elk-truststore-password
  auth:
    enabled: true
    username: ""
    password:
      secretRef: elasticsearch-secret
      secretKey: elasticsearch-secret

However, I still get the same error in both run-db-migrations and openmetadata.

run-db-migrations:

Caused by: java.util.concurrent.ExecutionException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

openmetadata pod:

ERROR [2025-10-17 10:58:18,676] [DefaultQuartzScheduler_Worker-3] o.o.s.e.s.DatabseAndSearchServiceStatusJob - Elastic Search Health Check encountered issues: java.util.concurrent.ExecutionException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Let me know if I missed anything.
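One thing that may be worth checking with a setup like this (an assumption, not something confirmed in this thread): when `keytool -import` adds a trusted-certificate entry it stores a single certificate under the given alias, so if `elk-ca-bundle.crt` holds more than one certificate (for example a root plus an intermediate), only the first may end up in the truststore. The sketch below counts the certificates in a PEM bundle and imports each one under its own alias; the helper names and the `elk-ca-N` aliases are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: import every certificate of a PEM bundle into a JKS truststore.
# Helper names and the elk-ca-N aliases are illustrative.

# count_pem_certs FILE: number of certificates in a PEM bundle.
count_pem_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# split_pem_bundle FILE OUT_DIR: write each certificate to OUT_DIR/ca-N.pem.
# Lines outside BEGIN/END markers (comments, blank lines) are skipped.
split_pem_bundle() {
  awk -v dir="$2" '
    /-----BEGIN CERTIFICATE-----/ { n++; out = dir "/ca-" n ".pem" }
    out != "" { print > out }
    /-----END CERTIFICATE-----/ { close(out); out = "" }
  ' "$1"
}

# import_bundle BUNDLE JKS_FILE STORE_PASSWORD
# Runs keytool once per certificate so each gets its own alias.
import_bundle() {
  tmp="$(mktemp -d)"
  split_pem_bundle "$1" "$tmp"
  i=0
  for pem in "$tmp"/ca-*.pem; do
    [ -e "$pem" ] || continue
    i=$((i + 1))
    keytool -importcert -noprompt -trustcacerts -alias "elk-ca-$i" \
      -file "$pem" -keystore "$2" -storetype JKS -storepass "$3"
  done
  rm -rf "$tmp"
}
```

Running `count_pem_certs elk-ca-bundle.crt` and comparing the result with the entries shown by `keytool -list -keystore elk-truststore.jks -storepass changeit` would indicate whether the truststore is missing part of the chain. If it is, `import_bundle elk-ca-bundle.crt elk-truststore.jks changeit` on a fresh keystore file would add all of them.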

sameer916 avatar Oct 17 '25 11:10 sameer916

After much struggling, I have managed to install OpenMetadata with cert-manager and the OpenSearch and PostgreSQL operators, with SSL-encrypted traffic between OpenMetadata, Airflow, OpenSearch, and PostgreSQL.

The only part that is not encrypted is the connection from Airflow to OpenMetadata, so it is still not 100%.

Hope this helps:

---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: airflow-cert
  labels:
    app.kubernetes.io/name: airflow-cert
    app.kubernetes.io/part-of: company
spec:
  secretName: airflow-cert
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  subject:
    organizations:
      - company
  commonName: dependencies-web.company-openmetadata.svc.cluster.local
  dnsNames:
    - dependencies-web
    - dependencies-web.company-openmetadata
    - dependencies-web.company-openmetadata.svc
    - dependencies-web.company-openmetadata.svc.cluster.local
  issuerRef:
    name: self-signed
    kind: ClusterIssuer
  keystores:
    jks:
      create: true
      passwordSecretRef:
        key: jks_password
        name: airflow-config
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: opensearch-http
  labels:
    app.kubernetes.io/name: opensearch-http
    app.kubernetes.io/part-of: company
spec:
  secretName: opensearch-http
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  subject:
    organizations:
      - company
  commonName: opensearch.company-openmetadata.svc.cluster.local
  dnsNames:
    - opensearch
    - opensearch.company-openmetadata
    - opensearch.company-openmetadata.svc
    - opensearch.company-openmetadata.svc.cluster.local
  issuerRef:
    name: self-signed
    kind: ClusterIssuer
  keystores:
    jks:
      create: true
      passwordSecretRef:
        key: jks_password
        name: opensearch-config
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: opensearch-admin
  labels:
    app.kubernetes.io/name: opensearch-admin
    app.kubernetes.io/part-of: company
spec:
  secretName: opensearch-admin
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  privateKey:
    algorithm: RSA
    encoding: PKCS8
    size: 2048
  subject:
    organizationalUnits:
      - opensearch
  dnsNames:
    - opensearch
  commonName: admin
  issuerRef:
    name: self-signed
    kind: ClusterIssuer
  keystores:
    jks:
      create: true
      passwordSecretRef:
        key: jks_password
        name: opensearch-config
---
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: opensearch
  labels:
    app.kubernetes.io/name: opensearch
    app.kubernetes.io/part-of: company
spec:
  security:
    config:
      securityConfigSecret:
        name: opensearch-securityconfig
      adminCredentialsSecret:
        name: opensearch-config
      adminSecret:
        name: opensearch-admin
    tls:
      http:
        generate: false
        secret:
          name: opensearch-http
        caSecret:
          name: opensearch-http
      transport:
        generate: true
        perNode: true
  general:
    httpPort: 9200
    serviceName: opensearch
    version: 2.19.3
    drainDataNodes: true
  dashboards:
    enable: false
    version: 2.19.3
    replicas: 1
  nodePools:
    - component: masters
      replicas: 3
      diskSize: "20Gi"
      resources:
        requests:
          memory: "1Gi"
          cpu: "500m"
        limits:
          memory: "1Gi"
      roles:
        - "data"
        - "cluster_manager"
      persistence:
        emptyDir: {}
---
apiVersion: opensearch.opster.io/v1
kind: OpensearchUser
metadata:
  name: openmetadata
spec:
  opensearchCluster:
    name: opensearch
  passwordFrom:
    name: opensearch-config
    key: openmetadata_password
  backendRoles:
    - admin
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres
  labels:
    app.kubernetes.io/name: postgres
    app.kubernetes.io/part-of: company
spec:
  instances: 1
  storage:
    size: 20Gi
    storageClass: standard
  monitoring:
    enablePodMonitor: true
  managed:
    roles:
    - name: openmetadata
      login: true
      superuser: false
      passwordSecret:
        name: openmetadata-db-config
    - name: airflow
      login: true
      superuser: false
      passwordSecret:
        name: airflow-db-config
---
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
  name: openmetadata
  labels:
    app.kubernetes.io/name: postgres
    app.kubernetes.io/part-of: company
spec:
  name: openmetadata
  owner: openmetadata
  cluster:
    name: postgres
---
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
  name: airflow
  labels:
    app.kubernetes.io/name: postgres
    app.kubernetes.io/part-of: company
spec:
  name: airflow
  owner: airflow
  cluster:
    name: postgres
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: dependencies
spec:
  interval: 24h
  timeout: 10m
  chart:
    spec:
      chart: openmetadata-dependencies
      version: "1.10.7"
      sourceRef:
        kind: HelmRepository
        name: openmetadata
        namespace: company-openmetadata
      interval: 24h
  driftDetection:
    mode: enabled
  install:
    crds: CreateReplace
    remediation:
      retries: -1
  upgrade:
    remediation:
      retries: -1
  values:
    image:
      tag: "1.10.7"
    mysql:
      enabled: false
    opensearch:
      enabled: false

    # Airflow Configuration with ReadWriteMany storage
    airflow:
      postgresql:
        enabled: false
      pgbouncer:
        enabled: false

      externalDatabase:
        type: postgres
        host: postgres-rw.company-openmetadata.svc.cluster.local
        port: 5432
        database: airflow
        userSecret: airflow-db-config
        userSecretKey: username
        passwordSecret: airflow-db-config
        passwordSecretKey: password
        properties: "?sslmode=verify-full&sslrootcert=/etc/ssl/postgres/ca.crt"

      dags:
        persistence:
          storageClass: shared
          size: 10Gi

      logs:
        persistence:
          storageClass: shared
          size: 10Gi

      airflow:
        extraVolumes:
          - name: airflow-cert
            secret:
              secretName: airflow-cert
          - name: postgres-ca
            secret:
              secretName: postgres-ca
        extraVolumeMounts:
          - name: airflow-cert
            mountPath: /etc/airflow/ssl
            readOnly: true
          - name: postgres-ca
            mountPath: /etc/ssl/postgres/ca.crt
            subPath: ca.crt
            readOnly: true
        config:
          AIRFLOW__WEBSERVER__WEB_SERVER_SSL_CERT: /etc/airflow/ssl/tls.crt
          AIRFLOW__WEBSERVER__WEB_SERVER_SSL_KEY: /etc/airflow/ssl/tls.key
        extraEnv:
          - name: AIRFLOW__CORE__FERNET_KEY
            valueFrom:
              secretKeyRef:
                name: airflow-config
                key: fernet_key
          - name: AIRFLOW__WEBSERVER__SECRET_KEY
            valueFrom:
              secretKeyRef:
                name: airflow-config
                key: webserver_secret_key
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: openmetadata
spec:
  interval: 24h
  timeout: 10m
  chart:
    spec:
      chart: openmetadata
      version: "1.10.7"
      sourceRef:
        kind: HelmRepository
        name: openmetadata
        namespace: company-openmetadata
      interval: 24h
  driftDetection:
    mode: enabled
  install:
    crds: CreateReplace
    remediation:
      retries: -1
  upgrade:
    remediation:
      retries: -1
  values:
    image:
      tag: "1.10.7"
    resources:
      limits:
        memory: 2048Mi
      requests:
        cpu: 1
        memory: 2048Mi
    startupProbe:
      periodSeconds: 5
      failureThreshold: 60

    extraEnvs:
      - name: AIRFLOW_TRUST_STORE_PASSWORD
        valueFrom:
          secretKeyRef:
            name: airflow-config
            key: jks_password
      - name: AIRFLOW_TRUST_STORE_PATH
        value: "/etc/ssl/airflow/truststore.jks"

    openmetadata:
      config:
        pipelineServiceClientConfig:
          enabled: true
          apiEndpoint: https://dependencies-web:8080
          metadataApiEndpoint: http://openmetadata:8585/api
          verifySsl: "validate"
          auth:
            enabled: true
            username: admin
            password:
              secretRef: airflow-config
              secretKey: admin_password

        database:
          enabled: true
          host: postgres-rw
          port: 5432
          driverClass: org.postgresql.Driver
          dbScheme: postgresql
          databaseName: openmetadata
          auth:
            username: openmetadata
            password:
              secretRef: openmetadata-db-config
              secretKey: password
          dbParams: "sslmode=verify-full&sslrootcert=/etc/ssl/postgres/ca.crt"

        elasticsearch:
          enabled: true
          host: opensearch
          searchType: opensearch
          port: 9200
          scheme: https
          connectionTimeoutSecs: 5
          trustStore:
            enabled: true
            path: "/etc/ssl/opensearch/truststore.jks"
            password:
              secretRef: opensearch-config
              secretKey: jks_password
          auth:
            enabled: true
            username: openmetadata
            password:
              secretRef: opensearch-config
              secretKey: openmetadata_password

    extraVolumes:
      - name: opensearch-tls
        secret:
          secretName: opensearch-http
      - name: airflow-tls
        secret:
          secretName: airflow-cert
      - name: postgres-ca
        secret:
          secretName: postgres-ca

    extraVolumeMounts:
      - name: opensearch-tls
        mountPath: /etc/ssl/opensearch/truststore.jks
        subPath: truststore.jks
        readOnly: true
      - name: airflow-tls
        mountPath: /etc/ssl/airflow/truststore.jks
        subPath: truststore.jks
        readOnly: true
      - name: postgres-ca
        mountPath: /etc/ssl/postgres/ca.crt
        subPath: ca.crt
        readOnly: true

    ingress:
      enabled: true
      className: nginx
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt-prod
        nginx.ingress.kubernetes.io/ssl-redirect: "true"
        nginx.ingress.kubernetes.io/enable-opentelemetry: "true"
      hosts:
        - host: openmetadata.company.com
          paths:
            - path: /
              pathType: Prefix
      tls:
        - hosts:
            - openmetadata.company.com
          secretName: openmetadata-tls


Fgruntjes avatar Nov 20 '25 10:11 Fgruntjes