amazon-cloudwatch-agent
Missing data points due to incorrect timestamp
Describe the bug
We regularly see gaps in metric data points, which seem to be caused by variance in the timestamps of the data sent to the /aws/containerinsights/<cluster>/performance log group.
Here is an example (node_memory_utilization metric for a particular EKS worker node):
Data for each minute was sent to CloudWatch. For example, the data received at 16:37 had timestamp 1663792616685, which is 16:36:56, while the data received at 16:38 had timestamp 1663792684433, which is 16:38:04.
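As a quick sanity check on those numbers (a minimal sketch; the 16:xx times in the report appear to be local UTC-4, which is 20:36:56 and 20:38:04 in UTC), the two points are roughly 67.7 seconds apart, i.e. more than the 60-second collection interval:

package main

import (
    "fmt"
    "time"
)

func main() {
    // Epoch-millisecond timestamps taken from the two consecutive data points above.
    ts := []int64{1663792616685, 1663792684433}
    for _, ms := range ts {
        fmt.Println(ms, "->", time.UnixMilli(ms).UTC().Format("15:04:05")+" UTC")
    }
    // Spacing between the two samples; anything above the 60 s
    // metrics_collection_interval can leave a minute without a data point.
    fmt.Println("delta:", time.UnixMilli(ts[1]).Sub(time.UnixMilli(ts[0])))
}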
Steps to reproduce: Provision an EKS cluster and deploy the CloudWatch agent as a daemonset with the default configuration. After some time this starts happening on at least one worker node.
What did you expect to see? Data is reported for every minute
What did you see instead? There are gaps in the data, but none longer than 2 minutes (a single data point is either missing or ingested with an incorrect timestamp).
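For illustration only (not the agent's code; times in UTC, minute labels assumed), bucketing those two samples per minute shows how a shifted timestamp leaves an empty minute even though no sample was actually dropped:

package main

import (
    "fmt"
    "time"
)

func main() {
    // The two epoch-millisecond timestamps from the report; neither sample is
    // lost, but the 20:37 UTC minute bucket ends up empty because the second
    // sample slipped into the 20:38 bucket.
    samples := []int64{1663792616685, 1663792684433}
    buckets := map[string]int{}
    for _, ms := range samples {
        minute := time.UnixMilli(ms).UTC().Truncate(time.Minute).Format("15:04")
        buckets[minute]++
    }
    for _, m := range []string{"20:36", "20:37", "20:38"} {
        fmt.Printf("%s UTC -> %d sample(s)\n", m, buckets[m])
    }
}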
What version did you use? public.ecr.aws/cloudwatch-agent/cloudwatch-agent:1.247354.0b251981
What config did you use? Config:
{
  "agent": {
    "region": "us-east-1"
  },
  "logs": {
    "metrics_collected": {
      "kubernetes": {
        "cluster_name": "<redacted>-eksCluster-cc71680",
        "metrics_collection_interval": 60
      }
    },
    "force_flush_interval": 5
  }
}
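For reference, a minimal sketch of reading the two interval settings out of a config shaped like the one above; the struct names here are illustrative and are not the agent's internal types:

package main

import (
    "encoding/json"
    "fmt"
)

// These structs only mirror the fields used in the config above.
type agentConfig struct {
    Agent struct {
        Region string `json:"region"`
    } `json:"agent"`
    Logs struct {
        MetricsCollected struct {
            Kubernetes struct {
                ClusterName               string `json:"cluster_name"`
                MetricsCollectionInterval int    `json:"metrics_collection_interval"`
            } `json:"kubernetes"`
        } `json:"metrics_collected"`
        ForceFlushInterval int `json:"force_flush_interval"`
    } `json:"logs"`
}

func main() {
    // A condensed copy of the config above, with a placeholder cluster name.
    raw := []byte(`{"agent":{"region":"us-east-1"},"logs":{"metrics_collected":{"kubernetes":{"cluster_name":"example","metrics_collection_interval":60}},"force_flush_interval":5}}`)
    var cfg agentConfig
    if err := json.Unmarshal(raw, &cfg); err != nil {
        panic(err)
    }
    fmt.Println("collection interval (s):", cfg.Logs.MetricsCollected.Kubernetes.MetricsCollectionInterval)
    fmt.Println("force flush interval (s):", cfg.Logs.ForceFlushInterval)
}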
Environment: AWS EKS 1.21 with self-managed worker nodes and a custom AMI. The CloudWatch agent is running as a daemonset.