kube-prometheus
Adding AWS CNI Metrics
kube-prometheus was installed using the quick start.
kube-prometheus provides an example (examples/eks-cni-example.jsonnet), and the EKS-cni-support docs page refers to the same file.
The doc refers to prometheus-serviceMonitorAwsEksCNI.yaml.
I tried to use jsonnet to generate the new yaml but have been unsuccessful. How do I generate the yaml file?
Here are the instructions I used to compile the examples/eks-cni-example.jsonnet to monitor the AWS VPC CNI Metrics.
Hope this helps.
Install Go
wget https://dl.google.com/go/go1.13.7.linux-amd64.tar.gz
tar xvzf go1.13.7.linux-amd64.tar.gz
sudo mv go /usr/local
export PATH=$PATH:/usr/local/go/bin:~/go/bin
Run go version to verify Go is in the PATH:
go version
go version go1.13.7 linux/amd64
Install the prerequisites for compiling additional jsonnet files:
go get github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb
go get github.com/brancz/gojsontoyaml
go get github.com/google/go-jsonnet/cmd/jsonnet
Get the kube-prometheus repo
git clone git@github.com:coreos/kube-prometheus.git
cd kube-prometheus
jb install
This will read jsonnetfile.json, download the necessary dependencies, and create the vendor directory.
Copy the eks-cni-example.jsonnet and compile it
cp examples/eks-cni-example.jsonnet .
./build.sh eks-cni-example.jsonnet
This will delete and recreate all the manifests, including the yamls for the AWS CNI (Service, ServiceMonitor, and the rules).
Side note: there is an issue with the latest build that recreates all the manifests in the manifests directory, so you may have to apply the manifests multiple times.
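Once compiled, the generated manifests can be applied with kubectl. A minimal sketch, assuming the standard kube-prometheus repository layout where CRDs and the namespace live under manifests/setup:

```shell
# Apply the CRDs and the monitoring namespace first, then the rest.
kubectl apply -f manifests/setup
kubectl apply -f manifests/
```

Because of the ordering issue noted above, the second command may need to be re-run until all resources are accepted.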
Here are the yamls for adding the AWS VPC CNI Metrics:
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: aws-node
  name: aws-node
  namespace: kube-system
spec:
  clusterIP: None
  ports:
  - name: cni-metrics-port
    port: 61678
    targetPort: 61678
  selector:
    k8s-app: aws-node
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: eks-cni
  name: awsekscni
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    path: /metrics
    port: cni-metrics-port
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: aws-node
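To sanity-check that the CNI metrics endpoint is actually reachable before relying on the ServiceMonitor, you can port-forward to one of the aws-node pods and curl the metrics path. A sketch, assuming the pods carry the k8s-app=aws-node label:

```shell
# Pick one aws-node pod (the DaemonSet runs one per node).
POD=$(kubectl get pods -n kube-system -l k8s-app=aws-node \
  -o jsonpath='{.items[0].metadata.name}')

# Forward the CNI metrics port locally and fetch the metrics.
kubectl port-forward -n kube-system "$POD" 61678:61678 &
sleep 2
curl -s http://localhost:61678/metrics | grep awscni_
```

If no awscni_ metrics come back, the issue is with the CNI metrics endpoint itself rather than the ServiceMonitor wiring.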
============== Prometheus rules
- name: kube-prometheus-eks.rules
  rules:
  - alert: EksAvailableIPs
    annotations:
      message: Instance {{ $labels.instance }} has less than 10 IPs available.
      runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-eksavailableips
    expr: |
      sum by(instance) (awscni_total_ip_addresses) - sum by(instance) (awscni_assigned_ip_addresses)
      < 10
    for: 10m
    labels:
      severity: critical
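The rule group above can be validated offline with promtool before Prometheus loads it. A sketch, assuming promtool is installed; the groups: wrapper and the file name are assumptions for a standalone rules file:

```shell
# Wrap the rule group in a standalone rules file (hypothetical file name).
cat > eks-cni.rules.yaml <<'EOF'
groups:
- name: kube-prometheus-eks.rules
  rules:
  - alert: EksAvailableIPs
    expr: sum by(instance) (awscni_total_ip_addresses) - sum by(instance) (awscni_assigned_ip_addresses) < 10
    for: 10m
    labels:
      severity: critical
EOF

# Check the syntax and the PromQL expression.
promtool check rules eks-cni.rules.yaml
```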
Do you feel there is anything additional in our setup that should be added or additionally documented? If you could contribute those, then that would be immensely helpful for the rest of the community! :)
Hello,
I just tried to apply it, but it didn't work.
I applied the chart eks/cni-metrics-helper (0.1.18) (USE_CLOUDWATCH = false) and the mentioned Service/ServiceMonitor, but I was unable to access these metrics.
ServiceMonitor also did not work for me.
What worked for me was to patch my aws-node DaemonSet like this:
kubectl patch daemonset aws-node -n kube-system --patch '
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "61678"
'
# To remove: an unquoted null deletes the key in a strategic merge
# patch (a quoted "null" would just set the annotation to that string).
kubectl patch daemonset aws-node -n kube-system --patch '
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: null
        prometheus.io/path: null
        prometheus.io/port: null
'
After applying the patch I could see the aws-node-xxx targets under kubernetes-pods from the prometheus ui.
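The same check can be done from the command line instead of the UI. A sketch, assuming the standard kube-prometheus service name prometheus-k8s in the monitoring namespace:

```shell
# Forward the Prometheus UI port locally.
kubectl port-forward -n monitoring svc/prometheus-k8s 9090:9090 &
sleep 2

# List active targets and filter for the aws-node pods.
curl -s http://localhost:9090/api/v1/targets | grep -o 'aws-node[^"]*'
```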
I use Terraform, so I was unable to use the patch without a hack. As a workaround, I added a new job that scrapes the CNI pods, and it works like a charm. I don't understand why this isn't clearly documented...
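For reference, a scrape job along those lines could look like the following additional scrape config. This is a sketch, not the commenter's exact config: the job name is made up, and it assumes the aws-node pods carry the k8s-app: aws-node label and expose metrics on port 61678:

```yaml
- job_name: aws-cni-metrics
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names:
      - kube-system
  relabel_configs:
  # Keep only the aws-node (VPC CNI) pods.
  - source_labels: [__meta_kubernetes_pod_label_k8s_app]
    regex: aws-node
    action: keep
  # Point the scrape address at the CNI metrics port on the pod IP.
  - source_labels: [__meta_kubernetes_pod_ip]
    regex: (.*)
    replacement: $1:61678
    target_label: __address__
```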