[BUG] After applying the new EKS integration, Cloudbeat sends errors regarding file configuration and the cluster's name.

Open gurevichdmitry opened this issue 2 years ago • 10 comments

Describe the bug

When creating a new KSPM integration with an EKS cluster deployment (correct credentials provided), the following cloudbeat error logs appear:

19:02:34.025 elastic_agent.cloudbeat [elastic_agent.cloudbeat][error] Failed to parse file configuration for process kubelet, error -type is not supported
19:02:34.045 elastic_agent.cloudbeat [elastic_agent.cloudbeat][error] fail to resolve the name of the cluster, error unable to retrieve cluster identifiers

Preconditions

ELK Stack and EKS clusters are deployed.

To Reproduce

Steps to reproduce the behavior:

  1. Create new KSPM integration
  2. Select EKS Kubernetes deployment
  3. Fill correct AWS credentials
  4. Save and Continue
  5. Add new agent and deploy to EKS cluster
  6. Navigate to Fleet
  7. Open agent logs
  8. Filter dataset = elastic_agent.cloudbeat
  9. Select log level = error
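For anyone who prefers to check exported agent logs outside Fleet, steps 8 and 9 can be approximated with a small script. This is only a sketch: it assumes each log line follows the same layout as the two lines quoted above (time, dataset, bracketed component, bracketed level, message), and the names `LOG_LINE` and `filter_logs` are made up for illustration.

```python
import re

# Assumed layout, taken from the lines quoted in this issue:
# <time> <dataset> [<component>][<level>] <message>
LOG_LINE = re.compile(
    r"^(?P<time>\S+) (?P<dataset>\S+) "
    r"\[(?P<component>[^\]]+)\]\[(?P<level>[^\]]+)\] (?P<message>.*)$"
)


def filter_logs(lines, dataset="elastic_agent.cloudbeat", level="error"):
    """Return the messages of lines matching the given dataset and log level."""
    messages = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match["dataset"] == dataset and match["level"] == level:
            messages.append(match["message"])
    return messages
```

Lines that do not match the assumed layout are skipped rather than raising, so the script tolerates mixed log output.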

Expected behavior

There are no error logs regarding file configuration and the cluster's name.

Desktop (please complete the following information):

  • OS: NA
  • Browser: NA
  • Kibana Version: 8.6.0 Commit: 93183bddac40f8a7ee8e566d1651f9f3b586a520
  • Endpoint Version: elastic-agent 8.6.0: 3bff6ff456946a30f5b743e6d36902e472c67e48
  • Other Version:

gurevichdmitry Dec 29 '22 17:12

19:02:34.045 elastic_agent.cloudbeat [elastic_agent.cloudbeat][error] fail to resolve the name of the cluster, error unable to retrieve cluster identifiers

I don't think that's a bug; we introduced EKS cluster name detection only in 8.7.

We said that in 8.6 it should use the clusterId instead.

FYI @gurevichdmitry @kfirpeled

ofiriro3 Dec 29 '22 19:12

@ofiriro3 does logging this as an error make sense in that case?

oren-zohar Dec 29 '22 23:12

@oren-zohar, @kfirpeled, @ofiriro3 please note that two different errors appeared in the log. The first error is for the cluster id, and the second is for the file process configuration. If the cluster id is supposed to appear only in 8.7, I'm wondering how it appeared in the 8.6 version. If that is the case, I think we should review our version branch management strategy.

gurevichdmitry Jan 02 '23 07:01

Not sure I understand what the issue is, or how it relates to our branching strategy @gurevichdmitry

oren-zohar Jan 02 '23 10:01

The first error is for the cluster id, and the second is for the file process configuration. If the cluster id is supposed to appear only in 8.7, I'm wondering how it appeared in the 8.6 version. If that is the case, I think we should review our version branch management strategy.

The error you mentioned refers to the cluster name, not the cluster id.

19:02:34.045 elastic_agent.cloudbeat [elastic_agent.cloudbeat][error] fail to resolve the name of the cluster, error unable to retrieve cluster identifiers

The cluster id is already supported in 8.6 for all cluster types. So for an EKS cluster, since we haven't implemented the relevant cluster_name_identifier, it should show the cluster id.

For example (our 8.6 daily environment): image

ofiriro3 Jan 02 '23 10:01

@ofiriro3 does logging this as an error make sense in that case?

@oren-zohar when I implemented the solution, we thought we would implement EKS cluster name detection for 8.6 as well, so at that time it did make sense.

Since we don't support identifying the EKS cluster name on 8.6, I assume it should be a debug log, if any log at all.
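To make the suggestion concrete, here is a rough sketch of the intended behavior: fall back to the cluster id when no name can be resolved, and record that at debug level rather than error. It is written in Python purely for illustration and is not cloudbeat's actual code; `resolve_cluster_label`, `name_lookup`, and the logger name are all hypothetical.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("cloudbeat.sketch")  # hypothetical logger name


def resolve_cluster_label(
    cluster_id: str,
    name_lookup: Optional[Callable[[], str]] = None,
) -> str:
    """Return the cluster name if it can be resolved, otherwise the cluster id."""
    if name_lookup is None:
        # No name detection for this cluster type (e.g. EKS on 8.6): expected, not an error.
        logger.debug("no cluster name lookup available, using cluster id")
        return cluster_id
    try:
        return name_lookup()
    except LookupError as err:
        # The lookup exists but failed; the id is still a usable label.
        logger.debug("could not resolve cluster name (%s), using cluster id", err)
        return cluster_id
```

With this shape, an unsupported cluster type no longer produces an error-level entry, while the displayed value still falls back to the id.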

To be honest, I think we should have merged it into 8.6 as well, since it looks a bit awkward that we show the cluster name for Vanilla and the cluster id for EKS. It's inconsistent and confusing.

ofiriro3 Jan 02 '23 11:01

fyi @kfirpeled @tehilashn

ofiriro3 Jan 02 '23 11:01

@gurevichdmitry @ofiriro3 regarding this situation: does the error change anything in the agent status? Is it still healthy?

oren-zohar Jan 02 '23 14:01

@oren-zohar the agent is still healthy; we just need to check why those error logs appear.

gurevichdmitry Jan 02 '23 14:01

@gurevichdmitry what do you mean? Didn't @ofiriro3 cover that part?

oren-zohar Jan 02 '23 15:01