[kube-prometheus-stack] prometheus-node-exporter fails to deploy on AKS with virtual nodes
The prometheus-node-exporter DaemonSet fails to deploy on the virtual ACI node of Azure Kubernetes Service (AKS), generating the following event:
kind: Event
apiVersion: events.k8s.io/v1
metadata:
  name: prometheus-prometheus-node-exporter-djsgh.1782f6d633e12ccb
  namespace: monitoring
  uid: ad206868-b7c1-4fe3-b1f6-fb976b8ee079
  resourceVersion: '88432'
  creationTimestamp: '2023-09-08T15:43:29Z'
  managedFields:
    - manager: virtual-kubelet
      operation: Update
      apiVersion: v1
      time: '2023-09-08T15:46:14Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:count: {}
        f:firstTimestamp: {}
        f:involvedObject: {}
        f:lastTimestamp: {}
        f:message: {}
        f:reason: {}
        f:source:
          f:component: {}
        f:type: {}
eventTime: null
reason: ProviderCreateFailed
regarding:
  kind: Pod
  namespace: monitoring
  name: prometheus-prometheus-node-exporter-djsgh
  uid: 554930c6-0a0b-4772-8311-eb0860c6660d
  apiVersion: v1
note: >-
  ACI does not support providing args without specifying the command. Please
  supply both command and args to the pod spec.
type: Warning
deprecatedSource:
  component: virtual-node-aci-linux/pod-controller
deprecatedFirstTimestamp: '2023-09-08T15:43:29Z'
deprecatedLastTimestamp: '2023-09-08T15:46:14Z'
deprecatedCount: 15
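For reference, this is roughly how the event and the offending container spec can be inspected; the object names are taken from the event above, and the jsonpath check is only a sketch of where the chart sets args while leaving command unset:

# Pull the ProviderCreateFailed warning emitted by the virtual-kubelet pod controller
kubectl -n monitoring get events --field-selector reason=ProviderCreateFailed,type=Warning

# Inspect the rendered node-exporter container: command is empty while args are populated,
# which is the combination ACI rejects
kubectl -n monitoring get daemonset prometheus-prometheus-node-exporter \
  -o jsonpath='{.spec.template.spec.containers[0].command}{"\n"}{.spec.template.spec.containers[0].args}{"\n"}'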
What's your helm version?
version.BuildInfo{Version:"v3.12.2", GitCommit:"1e210a2c8cc5117d1055bfaa5d40f51bbc2e345e", GitTreeState:"clean", GoVersion:"go1.20.5"}
What's your kubectl version?
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-14T09:53:42Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-19T16:10:31Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/amd64"}
Which chart?
kube-prometheus-stack
What's the chart version?
50.3.1
What happened?
No response
What you expected to happen?
No response
How to reproduce it?
No response
Enter the changed values of values.yaml?
NONE
Enter the command that you execute and failing/misfunctioning.
GROUP="test"
CLUSTERNAME="test"
LOCATION="eastus"
K8SVERSION="1.27.3"
az network vnet create \
  --resource-group $GROUP \
  --name $VNETNAME \
  --address-prefixes 10.224.0.0/12 \
  --subnet-name default \
  --subnet-prefix 10.224.0.0/16

az network vnet subnet create \
  --resource-group $GROUP \
  --vnet-name $VNETNAME \
  --name virtual-node-aci \
  --address-prefixes 10.239.0.0/16

AKSSUBNETID=$(az network vnet subnet show -g $GROUP --vnet-name $VNETNAME --name default --query [id] --output tsv)

az aks create \
  --resource-group $GROUP \
  --name $CLUSTERNAME \
  --kubernetes-version $K8SVERSION \
  --vnet-subnet-id $AKSSUBNETID \
  --aci-subnet-name virtual-node-aci \
  --network-plugin azure \
  --network-policy calico \
  --enable-addons virtual-node
az aks get-credentials --resource-group $GROUP --name $CLUSTERNAME
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add stable https://charts.helm.sh/stable
helm repo update
helm install --create-namespace --namespace monitoring prometheus prometheus-community/kube-prometheus-stack
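Once the chart is installed, the failure shows up on the DaemonSet pod scheduled onto the virtual node; a minimal check, assuming the standard labels set by the prometheus-node-exporter subchart:

# The node-exporter pod assigned to virtual-node-aci-linux never reaches Running
kubectl -n monitoring get pods -l app.kubernetes.io/name=prometheus-node-exporter -o wide
kubectl -n monitoring get daemonset prometheus-prometheus-node-exporter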
Anything else we need to know?
No response
The workaround I'm currently using is to prevent node-exporter from being deployed to virtual nodes by applying the following custom values to the chart:
prometheus-node-exporter:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: type
                operator: NotIn
                values:
                  - virtual-kubelet
The equivalent override for AWS Fargate:
prometheus-node-exporter:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: eks.amazonaws.com/compute-type
                operator: NotIn
                values:
                  - fargate
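Either snippet can be saved to a values file (the filename below is just an example) and applied to the release with helm:

# values-skip-virtual-nodes.yaml is a placeholder name for a file holding the affinity override above
helm upgrade --install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  -f values-skip-virtual-nodes.yaml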