cloud-on-k8s

`is_managed` not correctly updating agent policies

Open dragonfleas opened this issue 6 months ago • 6 comments

Here's my values.yaml for the eck-stack. As a test to make sure the policies were updating, I changed one of the kube-state-metrics input hosts to `- "kube-state-metrics:8080/metrics/TEST"`, but the change is never applied. I've attempted:

  • Recreating the fleet server pod, then the agent pod
  • Recreating the Kibana pod
  • Verifying the configuration is correct in the generated kibana.yml secret
  • Checking the Kibana logs, which only show `[2025-06-27T20:09:30.377+00:00][INFO ][plugins.fleet] Found 0 package policies that need agent policy revision bump`
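
For the third check above, ECK renders the Kibana configuration into a Secret named `<kibana-name>-kb-config` (so `kibana-kb-config` here, given the fullnameOverride). A minimal sketch of decoding it and confirming the flag made it into kibana.yml; the secret name follows ECK's convention, and the payload below is a stand-in for what `kubectl` would return:

```python
import base64

# On a live cluster you would fetch the base64 value with:
#   kubectl get secret kibana-kb-config -o jsonpath='{.data.kibana\.yml}'
# Here we decode a sample payload the same way that output would be decoded.
sample_b64 = base64.b64encode(
    b"xpack.fleet.agentPolicies:\n- id: eck-agent\n  is_managed: true\n"
).decode()

kibana_yml = base64.b64decode(sample_b64).decode()
print("is_managed: true" in kibana_yml)
```

If the rendered kibana.yml already contains the expected value, the problem is downstream of ECK, in how Kibana/Fleet reconciles preconfigured policies.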

I'm on version 3.0.0 of the eck-operator chart and 0.15.0 of the eck-stack chart.

I don't see any reason why `is_managed` wouldn't be working. Is this perhaps a regression?

eck-elasticsearch:
  enabled: true
  # This is adjusting the full name of the elasticsearch resource so that both the eck-elasticsearch
  # and the eck-kibana chart work together by default in the eck-stack chart.
  fullnameOverride: elasticsearch

  nodeSets:
  - name: default
    count: 3
    # Comment out when setting the vm.max_map_count via initContainer, as these are mutually exclusive.
    # For production workloads, it is strongly recommended to increase the kernel setting vm.max_map_count to 262144
    # and leave node.store.allow_mmap unset.
    # ref: https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-virtual-memory.html
    #
    podTemplate:
      spec:
        initContainers:
        - name: sysctl
          command: ["sh", "-c", "sysctl -w vm.max_map_count=262144"]
          securityContext:
            privileged: true
            runAsUser: 0

eck-kibana:
  enabled: true

  fullnameOverride: kibana

  # This is also adjusting the kibana reference to the elasticsearch resource named previously so that
  # both the eck-elasticsearch and the eck-kibana chart work together by default in the eck-stack chart.
  elasticsearchRef:
    name: elasticsearch

  config:
    server:
      publicBaseUrl: "https://kibana.REDACTED"
    xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-es-http.elastic-cloud.svc:9200"]
    xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-agent-http.elastic-cloud.svc:8220"]
    xpack.fleet.packages:
    - name: system
      version: latest
    - name: elastic_agent
      version: latest
    - name: fleet_server
      version: latest
    - name: kubernetes
      version: latest
    xpack.fleet.agentPolicies:
    - name: Fleet Server on ECK policy
      id: eck-fleet-server
      namespace: default
      is_managed: true
      monitoring_enabled:
      - logs
      - metrics
      package_policies:
      - name: fleet_server-1
        id: fleet_server-1
        package:
          name: fleet_server
    - name: Elastic Agent on ECK policy
      id: eck-agent
      namespace: default
      is_managed: true
      monitoring_enabled:
      - logs
      - metrics
      unenroll_timeout: 900
      package_policies:
      - name: system-1
        id: system-1
        package:
          name: system
      - name: kubernetes-1
        id: kubernetes-1
        package:
          name: kubernetes
        inputs:
          kube-state-metrics-kubernetes/metrics:
            enabled: true
            streams:
              '[kubernetes.state_container]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics/TEST"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_cronjob]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_daemonset]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_deployment]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_job]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_namespace]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_node]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_persistentvolume]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_persistentvolumeclaim]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_pod]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_replicaset]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_resourcequota]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_service]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_statefulset]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

              '[kubernetes.state_storageclass]':
                enabled: true
                vars:
                  add_metadata: true
                  hosts:
                  - "kube-state-metrics:8080/metrics"
                  leaderelection: true
                  period: 10s
                  bearer_token_file: "/var/run/secrets/kubernetes.io/serviceaccount/token"

  tls:
    # This has to be disabled for the ingress to work properly.
    # Traefik doesn't have a great way to handle end-to-end TLS encryption
    # on non-standard ports.
    selfSignedCertificate:
      disabled: true

  ingress:
    enabled: true
    annotations:
      traefik.ingress.kubernetes.io/router.middlewares: REDACTED@kubernetescrd
      traefik.ingress.kubernetes.io/router.entrypoints: websecure,web
      kubernetes.io/ingress.class: traefik
    tls:
      enabled: false
      # secretName: REDACTED
    pathType: Prefix
    hosts:
    - host: "kibana.REDACTED"
      path: "/"

eck-agent:
  enabled: true

  # Agent policy to be used.
  policyID: eck-agent
  # Reference to ECK-managed Kibana instance.
  #
  kibanaRef:
    name: kibana
  elasticsearchRefs: []
  # Reference to ECK-managed Fleet instance.
  #
  fleetServerRef:
    name: fleet-server

  mode: fleet
  daemonSet:
    podTemplate:
      spec:
        serviceAccountName: elastic-agent
        hostNetwork: true
        dnsPolicy: ClusterFirstWithHostNet
        automountServiceAccountToken: true
        securityContext:
          runAsUser: 0
        containers:
        - name: agent
          resources:
            requests:
              memory: 2Gi
              cpu: 1
            limits:
              memory: 2Gi
              cpu: 1
          volumeMounts:
          - mountPath: /var/lib/docker/containers
            name: varlibdockercontainers
          - mountPath: /var/log/
            name: varlog
        volumes:
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: varlog
          hostPath:
            path: /var/log

  clusterRole:
    name: elastic-agent
    rules:
    - apiGroups: [""] # "" indicates the core API group
      resources:
      - namespaces
      - pods
      - nodes
      - nodes/metrics
      - nodes/proxy
      - nodes/stats
      - events
      verbs:
      - get
      - watch
      - list
    - nonResourceURLs:
      - /metrics
      verbs:
      - get
      - watch
      - list
    - apiGroups: ["coordination.k8s.io"]
      resources:
        - leases
      verbs:
        - get
        - create
        - update
    - apiGroups: ["apps"]
      resources:
      - replicasets
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - "apps"
      resources:
      - statefulsets
      - deployments
      - replicasets
      - daemonsets
      verbs:
      - "get"
      - "list"
      - "watch"
    - apiGroups: ["batch"]
      resources:
      - jobs
      - cronjobs
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - "storage.k8s.io"
      resources:
      - storageclasses
      verbs:
      - "get"
      - "list"
      - "watch"
eck-fleet-server:
  enabled: true

  fullnameOverride: "fleet-server"

  deployment:
    replicas: 1
    podTemplate:
      spec:
        serviceAccountName: fleet-server
        automountServiceAccountToken: true

  # Agent policy to be used.
  policyID: eck-fleet-server
  kibanaRef:
    name: kibana
  elasticsearchRefs:
  - name: elasticsearch

eck-beats:
  enabled: false

eck-logstash:
  enabled: true

eck-apm-server:
  enabled: false

eck-enterprise-search:
  enabled: false
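
Independent of the values file, one way to confirm whether the edit actually reached Fleet is to fetch the live package policy from Kibana's Fleet API (`GET /api/fleet/package_policies/kubernetes-1`) and inspect the stream vars. A sketch of the inspection side, assuming the usual `item.inputs[].streams[].vars.hosts.value` response shape, which may vary between stack versions:

```python
def find_stream_hosts(policy_item: dict, dataset: str) -> list:
    """Walk a Fleet package policy and return the hosts var for one stream.

    Assumes the item.inputs[].streams[] layout returned by
    GET /api/fleet/package_policies/<id>; adjust if your version differs.
    """
    for inp in policy_item.get("inputs", []):
        for stream in inp.get("streams", []):
            if stream.get("data_stream", {}).get("dataset") == dataset:
                return stream.get("vars", {}).get("hosts", {}).get("value", [])
    return []

# Example response fragment mirroring the values.yaml above.
item = {
    "inputs": [{
        "type": "kubernetes/metrics",
        "streams": [{
            "data_stream": {"dataset": "kubernetes.state_container"},
            "vars": {"hosts": {"value": ["kube-state-metrics:8080/metrics/TEST"]}},
        }],
    }],
}
hosts = find_stream_hosts(item, "kubernetes.state_container")
print(hosts)
```

If the live policy still shows the old URL while the preconfiguration in kibana.yml shows the new one, that matches the "created once, never updated" behaviour described below.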

dragonfleas · Jun 27 '25 20:06

We encountered the same issue. Is there an update on this? At the moment we're working around it by adjusting the package policies via the API, but that can't be the intended solution: part of the configuration is then managed through CRDs and another part through the API.
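
For anyone stuck on the same workaround: the update goes through Kibana's Fleet API with the mandatory `kbn-xsrf` header. A hedged sketch that only assembles the request so the moving parts are visible; the endpoint and header are real, while the URL, credentials, and payload fields here are placeholders to adapt:

```python
import json

def build_package_policy_update(kibana_url: str, policy_id: str, body: dict) -> dict:
    """Assemble a PUT /api/fleet/package_policies/<id> request.

    Kibana rejects state-changing API calls that lack the kbn-xsrf header.
    """
    return {
        "method": "PUT",
        "url": f"{kibana_url}/api/fleet/package_policies/{policy_id}",
        "headers": {
            "kbn-xsrf": "true",
            "Content-Type": "application/json",
        },
        "data": json.dumps(body),
    }

req = build_package_policy_update(
    "https://kibana.example.internal",  # placeholder URL
    "kubernetes-1",
    {   # minimal placeholder payload; the real PUT body needs the full policy
        "name": "kubernetes-1",
        "namespace": "default",
        "policy_id": "eck-agent",
        "package": {"name": "kubernetes", "version": "latest"},
    },
)
print(req["url"])
```

Sending `req` with any HTTP client (plus basic auth for the `elastic` user) applies the change that the preconfiguration in kibana.yml fails to pick up.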

SebastianZ84 · Jul 29 '25 07:07

I'm encountering the same issue, although I can't see that it is related to `is_managed`: it occurs with `is_managed` set to both true and false. The agent policy itself seems to update, but the package policies do not. Changing the id of a package policy creates a new one with the new settings, so it appears policies are only created, never updated once they exist. Removing a package policy and restarting or updating Kibana also does not recreate it. I'm on eck-operator 3.0.0.

`[2025-07-29T14:42:43.632+00:00][INFO ][plugins.fleet] Found 0 package policies that need agent policy revision bump`

fredddie3 · Jul 29 '25 14:07

The underlying functionality actually lives in Kibana, not in the ECK operator, so it would be good to know which version of Kibana is affected.

Can those of you that have encountered this issue maybe share the Kibana/Elastic Stack version you were using?

We can then forward/move the bug report to the Kibana repository.

pebrc · Jul 30 '25 13:07

I'm running Kibana 9.0.3.

fredddie3 · Jul 30 '25 13:07

I can also note that Elasticsearch is running 9.0.3 and fleet-server is running 9.0.0, if that's needed.

fredddie3 · Jul 30 '25 13:07

After doing a bit of research, I believe this has been previously reported in https://github.com/elastic/kibana/issues/111401 and summarised in https://github.com/elastic/kibana/issues/190333.

pebrc · Jul 30 '25 14:07