[BUG] delegated subnets are case sensitive

Open tyler-lloyd opened this issue 1 year ago • 3 comments

Describe the bug

When multiple agent pools use the same podSubnetID but with different casing, pod networking breaks when the most recent agent pool is deleted, even if other agent pools remain up.

This same result can happen when using BYO subnet with API Server VNet integration.
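A minimal sketch for spotting the condition described above, assuming the subnet IDs of each agent pool have already been collected one per line (for example from the CLI's agent pool listing; how they are gathered is left open here). The function name `find_case_variants` and the sample IDs are hypothetical and only serve as illustration:

```shell
#!/usr/bin/env bash
# find_case_variants (hypothetical helper): reads one resource ID per line
# on stdin and prints every ID whose lowercase form is shared with at least
# one differently-cased spelling.
find_case_variants() {
    awk '
        {
            key = tolower($0)
            if (!($0 in seen)) {          # count each distinct spelling once
                seen[$0] = 1
                variants[key]++
                ids[key] = ids[key] "  " $0 "\n"
            }
        }
        END {
            for (key in variants)
                if (variants[key] > 1)    # same ID, more than one casing
                    printf "%s", ids[key]
        }
    '
}

# Sample input: hypothetical IDs, shortened for readability.
vnet="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg/providers/Microsoft.Network/virtualNetworks/aks-vnet"
printf '%s\n' \
    "${vnet}/subnets/podsubnet1" \
    "${vnet}/subnets/PodSubnet1" \
    "${vnet}/subnets/nodesubnet1" | find_case_variants
```

With the sample input, only the two `podsubnet1` spellings are reported; the node subnet appears once and is left out.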

To Reproduce

Steps to reproduce the behavior:

  1. Create a cluster using podSubnetID, i.e. Dynamic IP Allocation for Azure CNI.
  2. Add a new agent pool that uses the same podSubnetID but with different casing in the subnet name, e.g. podsubnet changed to PodSubnet.

Expected behavior

AKS should be case insensitive to any Azure resource ID.
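As an illustration of the expected behavior, here is a small sketch of a case-insensitive comparison of two resource IDs. It is not how AKS implements anything; the helper names `normalize_id` and `same_resource_id` and the sample IDs are assumptions made up for this example:

```shell
#!/usr/bin/env bash
# normalize_id (hypothetical helper): lowercase a resource ID so that
# spellings differing only in case compare equal.
normalize_id() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# same_resource_id (hypothetical helper): succeeds when both IDs refer to
# the same resource under a case-insensitive comparison.
same_resource_id() {
    [ "$(normalize_id "$1")" = "$(normalize_id "$2")" ]
}

# Hypothetical IDs that differ only in the casing of the subnet name.
vnet="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg/providers/Microsoft.Network/virtualNetworks/aks-vnet"
a="${vnet}/subnets/podsubnet"
b="${vnet}/subnets/PodSubnet"

if same_resource_id "$a" "$b"; then
    echo "same subnet"
else
    echo "different subnets"
fi
```

Under this comparison, `podsubnet` and `PodSubnet` are treated as one subnet, which is the behavior the report asks for.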

Screenshots

n/a

Environment (please complete the following information):

  • Any cluster using podSubnetID, or API Server VNet integration with a BYO subnet.

tyler-lloyd, Jun 10 '24 17:06

It looks like this issue is not reproducible via az-cli. The reproduction script I used is below:

# AKS issue 4346
# Basic parameter
ranNum=${RANDOM}
rG=aks-subcase-${ranNum}
aks=aks-subcase-${ranNum}
vnet=aks-vnet
location=southeastasia

echo "Your resource group will be: ${rG}"
az group create -n ${rG} -l ${location} -o none

# Preparing VNet
az network vnet create -g ${rG} -n ${vnet} --address-prefixes 10.208.0.0/12 -o none 
az network vnet subnet create -n nodesubnet1 -g ${rG} --vnet-name ${vnet} --address-prefixes 10.208.0.0/24 -o none --no-wait
az network vnet subnet create -n podsubnet1 -g ${rG} --vnet-name ${vnet} --address-prefixes 10.210.0.0/24 -o none 

# Quote the JMESPath query so the shell does not treat [0] as a glob pattern
vnetId=$(az resource list -g ${rG} \
    --resource-type Microsoft.Network/virtualNetworks \
    --query "[0].id" -o tsv)

# Create AKS
az aks create -n ${aks} -g ${rG} \
    --no-ssh-key -o none \
    --nodepool-name agentpool \
    --node-os-upgrade-channel None \
    --node-count 1 \
    --node-vm-size Standard_A4_v2 \
    --network-plugin azure \
    --vnet-subnet-id ${vnetId}/subnets/nodesubnet1 \
    --pod-subnet-id ${vnetId}/subnets/podsubnet1

# Add a new user node pool that uses the same subnets but with different casing
az aks nodepool add --cluster-name ${aks} -g ${rG} -n userpool \
    --mode User \
    --node-count 2 \
    --vnet-subnet-id ${vnetId}/subnets/NodeSubnet1 \
    --pod-subnet-id ${vnetId}/subnets/PodSubnet1 \
    -o none 

# Delete the most recently added node pool (the step the report says breaks pod networking)
az aks nodepool delete --cluster-name ${aks} -g ${rG} -n userpool

# Deploy example deployment
az aks get-credentials -n ${aks} -g ${rG}

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
EOF

kubectl get po -w

And it looks fine: [image]

Can you provide more information, such as the following:

  • How did you deploy the AKS cluster: via Terraform or something else? What options did you set when creating it (e.g. VNet integration, built-in add-ons)?
  • The current AKS version.
  • What exactly do you mean by "pod networking will break"? What does it look like?
  • If you have opened a support ticket, you can post it here.

JoeyC-Dev, Jun 11 '24 06:06

@tyler-lloyd could you please review the comment from @JoeyC-Dev and confirm whether this is still a bug?

AllenWen-at-Azure, Aug 29 '24 10:08

This issue will now be closed because it has had no activity for 7 days after being marked stale. tyler-lloyd, feel free to comment again within the next 7 days to reopen it, or open a new issue after that time if you still have a question, issue, or suggestion.