Option to customize AKS managed Istio add-on ingress gateways
Is your feature request related to a problem? Please describe.
Currently, when enabling an ingress gateway (external or internal), there is no option to customize it. For example, I want to create the internal ingress gateway in a different subnet by providing the annotation service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "internal-subnet".
I also want to assign a static IP or an existing IP to the external load balancer, which is not possible as of now.
Describe the solution you'd like
Ability to customize the internal and external ingress gateways through a ConfigMap or the CLI. Terraform support would also be good: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster#service_mesh_profile
Describe alternatives you've considered
NA
+1
Would be a nice addition.
@m-amaresh, @Pindar - we are currently planning for ingress customizations through the upcoming Gateway API + Istio experiences. We are discussing the plans internally and will share an update here once we have the final details and a clear ETA.
+1
@m-amaresh @talex-zeiss @Pindar @williamohara the following annotations can now be added to the Istio ingress K8s service:
- service.beta.kubernetes.io/azure-load-balancer-internal-subnet: to bind an internal ingress gateway to a specific subnet.
- service.beta.kubernetes.io/azure-shared-securityrule: for exposing the ingress gateway through an augmented security rule.
- service.beta.kubernetes.io/azure-allowed-service-tags: for specifying which service tags the ingress gateway can receive requests from.
- service.beta.kubernetes.io/azure-load-balancer-ipv4: for configuring a static IPv4 address.
- service.beta.kubernetes.io/azure-load-balancer-resource-group: for specifying the resource group of a public IP in a different resource group from the cluster.
- service.beta.kubernetes.io/azure-pip-name: for specifying the name of a public IP address.
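As a rough illustration of the annotations listed above, the fragment below is a minimal patch sketch for the internal ingress gateway Service. The subnet name comes from the original request. The Service name aks-istio-ingressgateway-internal and the namespace aks-istio-ingress are assumptions based on the add-on's default naming, so verify them with kubectl get svc -n aks-istio-ingress before relying on them.

```yaml
# internal-gw-annotations.yaml (hypothetical file name)
# Strategic-merge patch that adds one annotation to the internal ingress gateway Service.
metadata:
  annotations:
    # "internal-subnet" is the example subnet name from the original request.
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "internal-subnet"
```

Under those assumptions, it could be applied with something like kubectl patch service aks-istio-ingressgateway-internal -n aks-istio-ingress --patch-file internal-gw-annotations.yaml. The other annotations in the list can be added under the same annotations key.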
@nshankar13 Thanks for your response. I was expecting API support when creating or updating the cluster. API Reference
Managed Istio can be installed when creating the cluster, so customizing the ingress gateway configuration manually after cluster creation does not make sense to me. If API support is added, we can configure these settings when creating or updating the cluster.
For example, using Terraform:
resource "azurerm_kubernetes_cluster" "example" {
  name                = "example-aks1"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  dns_prefix          = "exampleaks1"

  default_node_pool {
    name       = "default"
    node_count = 1
    vm_size    = "Standard_D2_v2"
  }

  identity {
    type = "SystemAssigned"
  }

  service_mesh_profile {
    mode      = "Istio"
    revisions = ["asm-1-20", "asm-1-21"]

    internal_ingress_gateway_enabled = true
    internal_ingress_gateway_subnet  = "some-subnet"
    internal_ingress_gateway_ip      = "10.1.0.4"
    # Other configs as required

    external_ingress_gateway_enabled = true
    external_ingress_gateway_ip      = "existing_pip_name"
    # Other configs as required
  }
}
Once API support is added, this can be done via the Azure Portal or Terraform (the azurerm provider would also be updated to support the new API parameters).
Hi @nshankar13 - Do the annotations have to be added manually by running kubectl commands against the cluster? Or is there a shared ConfigMap, like the one for MeshConfig, so that once we add them there the ingress pods will get these annotations?
Is it available in asm-1-21, asm-1-22, or asm-1-23?
Hello @shashankbarsin, we are currently using the self-managed Istio model and are looking to transition to the managed Istio add-on. Our services rely heavily on the IstioOperator model and override many configurations.
Is there a plan to allow configuring more settings in the managed add-on? Below is the istioOperatorSpec for one of our services as an example.
# Override IstioOperator.yaml file
global:
  istioOperatorSpec:
    spec:
      profile: "default"
      hub: sample.azurecr.io/istio
      meshConfig:
        defaultConfig:
          holdApplicationUntilProxyStarts: true
        trustDomain: samplecluster
      components:
        ingressGateways:
          - name: istio-ingressgateway
            enabled: true
            k8s:
              readinessProbe:
                failureThreshold: 5
                httpGet:
                  path: /healthz/ready
                  port: 15021
                initialDelaySeconds: 5
                periodSeconds: 2
                successThreshold: 3
                timeoutSeconds: 1
              env:
                - name: ISTIO_META_ROUTER_MODE
                  value: "sni-dnat"
                - name: ISTIO_META_IDLE_TIMEOUT
                  value: 960s
              service:
                ports:
                  - port: 443
                    targetPort: 8443
                    name: https
                    protocol: TCP
              hpaSpec:
                minReplicas: __InfraHelmConfiguration-IstioOperator-IngressGateways-K8sHpa-MinReplicas__
                maxReplicas: __InfraHelmConfiguration-IstioOperator-IngressGateways-K8sHpa-MaxReplicas__
                # NOTE: do not remove the parameters below. these are the defaults for this istio component. these were added because configmerge will
                # turn hpaSpec into null if the optional parameters aren't set without it. if hpaSpec becomes null, then it blows away all of these
                # values in the parent helm chart.
                scaleTargetRef:
                  apiVersion: apps/v1
                  kind: Deployment
                  name: istio-ingressgateway
                metrics:
                  - type: Resource
                    resource:
                      name: cpu
                      target:
                        type: Utilization
                        averageUtilization: 60
              resources:
                requests:
                  cpu: 2000m
                  memory: 512Mi
                limits:
                  # Istio will not allow blank CPU limit, and defaults to 2000m. Setting explicitly.
                  cpu: 8000m
                  memory: 1Gi
              strategy:
                rollingUpdate:
                  maxSurge: "100%"
                  maxUnavailable: "25%"
              overlays:
                - kind: Deployment
                  name: istio-ingressgateway
                  patches:
                    - path: spec.template.spec.dnsConfig
                      value:
                        options:
                          - name: single-request-reopen
                    - path: spec.template.spec.topologySpreadConstraints
                      value:
                        - maxSkew: 1
                          topologyKey: "kubernetes.io/hostname"
                          whenUnsatisfiable: ScheduleAnyway
                          labelSelector:
                            matchExpressions:
                              - key: app
                                operator: In
                                values:
                                  - istio-ingressgateway
        pilot:
          enabled: true
          k8s:
            resources:
              requests:
                cpu: 4000m
                memory: 2Gi
              limits:
                cpu: 4000m
                memory: 2Gi
            hpaSpec:
              minReplicas: __InfraHelmConfiguration-IstioOperator-Pilot-K8sHpa-MinReplicas__
              maxReplicas: __InfraHelmConfiguration-IstioOperator-Pilot-K8sHpa-MaxReplicas__
            overlays:
              - kind: Deployment
                name: istiod
                patches:
                  - path: spec.template.spec.dnsConfig
                    value:
                      options:
                        - name: single-request-reopen
        cni:
      values:
        global:
          multiCluster:
            enabled: true
            clusterName: samplecluster
            globalDomainSuffix: "global"
            includeEnvoyFilter: true
          defaultNodeSelector:
            kubernetes.io/os: linux
          proxy:
            resources:
              requests:
                cpu: "500m"
                memory: 1Gi
              limits:
                # Istio will not allow blank CPU limit, and defaults to 2000m. Setting explicitly here to be obvious that it is not actually unlimited.
                cpu: "3000m"
                memory: 1Gi
        # See details here https://github.com/istio/istio/issues/26923
        pilot:
          env:
            # pilot is replaced by istiod
            # Istio 1.3 added trust domain validation by default, but this breaks multi-trust domain scenarios, so we disable it
            PILOT_SKIP_VALIDATE_TRUST_DOMAIN: true
        # security: # security field is deprecated in 1.6
        #   By default Istio creates a self signed root certificate authority. If you are creating a multi-cluster mesh
        #   (see Section 6), you should set this to false. This will tell Istio to look for your custom Root CA
        #   certificate. If your mesh is single-cluster, feel free to omit this flag or set it to true.
        #   selfSigned: false
        telemetry:
          # v1: # v1 removed, so setting no longer needed
          #   # disable mixer (telemetry v1) for istio 1.8+
          #   enabled: false
          v2:
            # the default option for istio 1.8+
            enabled: true
        gateways:
          istio-ingressgateway:
            # preserve original client source IP, https://istio.io/latest/docs/tasks/security/authorization/authz-ingress/
            externalTrafficPolicy: Local
            ports:
              # We could enable this port if mtls is enabled on all exposed ports but upstream load balancer requires an endpoint for health check
              # Istio/Envoy provides this port for default health check
              # - port: 15020
              #   targetPort: 15020
              #   name: status-port
              - port: 443
                targetPort: 32443
                name: https
            # This is to create 2 ingress gateway pods at minimum if not specified in "PlatformConfig.Deployment.IstioChartValues.Gateway.MinIngressReplicas"
            # According to https://github.com/istio/operator, this is a pass-through API to helm values that we should avoid using in future. Validated "autoscaleMin" would be overwritten by hpaSpec -> minReplicas
            autoscaleMin: 2
            podAntiAffinityTermLabelSelector:
              - key: app
                operator: In
                values: istio-ingressgateway
                topologyKey: "kubernetes.io/os"
        cni:
          excludeNamespaces:
            - istio-system
            - kube-system
          logLevel: info
@m-amaresh @KarthikDev we will not be exposing static IP configuration via the API. The planned and preferred method for configuring the IP address is Gateway API, and we are currently working on that implementation, so we are prioritizing it over API changes. It would also be infeasible to cover every possible annotation (pip-name, load-balancer-resource-group, allowed-service-tags, etc.) via the API. So right now the only option is to configure the K8s service spec directly.
We understand that this may not be ideal for IaC, but I think with Gateway API this would be possible if you deploy the K8s Gateway via some IaC / GitOps workflow.
The annotations are available for all supported Istio add-on revisions: https://learn.microsoft.com/en-us/azure/aks/istio-support-policy#service-mesh-add-on-release-calendar
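To make the Gateway API suggestion more concrete, here is a minimal sketch of a Kubernetes Gateway that requests a specific address, following upstream Gateway API conventions. The thread does not say what the add-on's Gateway API support will look like, so treat every detail as an assumption: the resource name, the namespace, the istio gateway class (the name used by upstream Istio), and the listener are illustrative only. The IP is the example value from the Terraform snippet earlier in the thread.

```yaml
# Hypothetical Gateway manifest that could be kept in a GitOps repository.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: example-internal-gateway   # hypothetical name
  namespace: example-namespace     # hypothetical namespace
spec:
  gatewayClassName: istio          # upstream Istio class name; the add-on may differ
  addresses:
    - type: IPAddress
      value: 10.1.0.4              # example static IP from the thread
  listeners:
    - name: http
      port: 80
      protocol: HTTP
      allowedRoutes:
        namespaces:
          from: Same
```

Because the Gateway is an ordinary Kubernetes resource, it can be versioned and applied by the same IaC or GitOps tooling that manages other manifests, which is the point made in the comment above.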
@shahriaak unfortunately we cannot expose all of those options via the API for customization right now. However, some of those options are configurable via MeshConfig and the Telemetry API:
- https://learn.microsoft.com/en-us/azure/aks/istio-meshconfig
- https://learn.microsoft.com/en-us/azure/aks/istio-telemetry
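For the MeshConfig route, a sketch of the shared ConfigMap might look like the following. It assumes the naming convention described in the linked MeshConfig doc (istio-shared-configmap-<revision> in the aks-istio-system namespace) and uses asm-1-22 as an example revision. holdApplicationUntilProxyStarts is taken from the IstioOperator spec shared earlier. Whether a given field is supported by the add-on should be checked against that doc.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Assumed naming convention: istio-shared-configmap-<revision>
  name: istio-shared-configmap-asm-1-22
  namespace: aks-istio-system
data:
  mesh: |-
    defaultConfig:
      holdApplicationUntilProxyStarts: true
```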
We also allow customization of HPA for Istiod and the ingress gateways:
- https://learn.microsoft.com/en-us/azure/aks/istio-scale#horizontal-pod-autoscaling-customization
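As an illustration of the HPA customization, the fragment below is a minimal merge-patch sketch that changes only the replica bounds. The values 3 and 6 are arbitrary examples, and the HPA name is not given in this thread, so list the actual resources first (for example with kubectl get hpa -n aks-istio-ingress for the gateways, or -n aks-istio-system for istiod).

```yaml
# hpa-patch.yaml (hypothetical file name)
# Merge patch that overrides only the replica bounds of an existing HPA.
spec:
  minReplicas: 3   # example value
  maxReplicas: 6   # example value
```

It could then be applied with something like kubectl patch hpa <hpa-name> -n <namespace> --type merge --patch-file hpa-patch.yaml, where both placeholders must be filled in from the listing above.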
For multi-cluster, we don't expose clusterName, meshId, etc. via the API. However, from my understanding, that only impacts telemetry generation. We are currently discussing the multi-cluster experience we want to provide for Istio on AKS and how this would work for the add-on versus our upcoming fully managed (hosted control plane) offering. We will share an update once we have a better idea of the support scope for multi-cluster.
@nshankar13 Thank you for your reply! Could you please point me to the current values for the spec I shared above? Based on that, we need to determine whether we can move from self-managed Istio to the Azure managed Istio add-on.
@shahriaak you can get the values by running helm get values --all azure-service-mesh-istio-discovery -n aks-istio-system after enabling the add-on - see the example below. Please let us know which fields you need customization for.
global:
  autoscalingv2API: true
  caAddress: ""
  caName: ""
  certSigners: []
  commonGlobals:
    AADTenantID: <>
    CCPID: <>
    CloudEnvironment: <>
    Versions:
      Kubernetes: 1.29.7
    enableKonnectivity: true
  configCluster: false
  configValidation: true
  defaultPodDisruptionBudget:
    enabled: true
  defaultResources:
    requests:
      cpu: 10m
  externalIstiod: false
  hub: mcr.microsoft.com/oss/istio
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  istioNamespace: aks-istio-system
  istiod:
    enableAnalysis: false
  logAsJson: false
  logging:
    level: default:info
  meshID: ""
  meshNetworks: {}
  mountMtlsCerts: false
  multiCluster:
    clusterName: ""
    enabled: false
  network: ""
  omitSidecarInjectorConfigMap: false
  operatorManageWebhooks: false
  pilotCertProvider: istiod
  priorityClassName: system-cluster-critical
  proxy:
    autoInject: enabled
    clusterDomain: cluster.local
    componentLogLevel: misc:error
    enableCoreDump: false
    excludeIPRanges: ""
    excludeInboundPorts: ""
    excludeOutboundPorts: ""
    image: proxyv2
    includeIPRanges: '*'
    includeInboundPorts: '*'
    includeOutboundPorts: ""
    logLevel: warning
    outlierLogPath: ""
    privileged: false
    readinessFailureThreshold: 4
    readinessInitialDelaySeconds: 0
    readinessPeriodSeconds: 15
    resources:
      limits:
        cpu: 2000m
        memory: 1024Mi
      requests:
        cpu: 100m
        memory: 128Mi
    startupProbe:
      enabled: true
      failureThreshold: 600
    statusPort: 15020
    tracer: zipkin
  proxy_init:
    image: proxyv2
  remotePilotAddress: ""
  sds:
    token:
      aud: istio-ca
  sts:
    servicePort: 0
  tag: latest
  variant: distroless
highestRevision: asm-1-22
istio_cni:
  chained: true
  provider: default
istiodRemote:
  injectionCABundle: ""
  injectionPath: /inject
  injectionURL: ""
lastReconciliation: "2024-10-02T00:59:44Z"
meshConfig:
  defaultConfig:
    gatewayTopology:
      numTrustedProxies: 1
  enablePrometheusMerge: true
  rootNamespace: aks-istio-system
ownerName: ""
pilot:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: kubernetes.azure.com/mode
            operator: In
            values:
            - system
        weight: 100
      - preference:
          matchExpressions:
          - key: azureservicemesh/istio.replica.preferred
            operator: In
            values:
            - "true"
        weight: 50
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
          - key: type
            operator: NotIn
            values:
            - virtual-kubelet
          - key: kubernetes.azure.com/cluster
            operator: Exists
  autoscaleBehavior: {}
  autoscaleEnabled: true
  autoscaleMax: 5
  autoscaleMin: 2
  cni:
    enabled: false
    provider: default
  configMap: true
  cpu:
    targetAverageUtilization: 80
  deploymentLabels:
    kubernetes.azure.com/managedby: aks
  env:
    ENABLE_NATIVE_SIDECARS: "true"
    PILOT_ENABLE_GATEWAY_API_DEPLOYMENT_CONTROLLER: "false"
  extraContainerArgs: []
  hub: mcr.microsoft.com/oss/istio
  image: pilot
  ipFamilies: []
  ipFamilyPolicy: ""
  jwksResolverExtraRootCA: ""
  keepaliveMaxServerConnectionAge: 30m
  memory: {}
  nodeSelector: {}
  podAnnotations: {}
  podLabels:
    kubernetes.azure.com/managedby: aks
  replicaCount: 2
  resources:
    requests:
      cpu: 500m
      memory: 2048Mi
  rollingMaxSurge: 100%
  rollingMaxUnavailable: 25%
  seccompProfile: {}
  serviceAccountAnnotations: {}
  serviceAnnotations: {}
  tag: ""
  taint:
    enabled: false
    namespace: ""
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  topologySpreadConstraints: []
  traceSampling: 1
  trustedZtunnelNamespace: ""
  variant: distroless
  volumeMounts: []
  volumes: []
releaseManifests:
- imageTag: 1.22.3
  revision: asm-1-22
revision: ""
revisionTags: []
revisions:
- asm-1-22
sidecarInjectorWebhook:
  alwaysInjectSelector: []
  defaultTemplates: []
  enableNamespacesByDefault: false
  injectedAnnotations: {}
  neverInjectSelector: []
  reinvocationPolicy: Never
  rewriteAppHTTPProbe: true
  templates: {}
telemetry:
  enabled: true
  v2:
    enabled: true
    prometheus:
      enabled: true
    stackdriver:
      enabled: false
@shahriaak please open a new Issue for the values customization