
Option to customize AKS Managed Istio add-on ingress gateways

Open m-amaresh opened this issue 1 year ago • 6 comments

Is your feature request related to a problem? Please describe. Currently, when enabling the ingress gateways (external and internal), we don't have any option to customize them. For example, I want to create the internal ingress gateway in a different subnet by providing the annotation service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "internal-subnet". I also want to assign a static IP or an existing IP to the external load balancer, which is not possible as of now.

Describe the solution you'd like Ability to customize the internal and external ingress gateways through a ConfigMap or the CLI. Terraform support would be good: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/kubernetes_cluster#service_mesh_profile

Describe alternatives you've considered NA

m-amaresh avatar Jul 29 '24 10:07 m-amaresh

+1

talex-zeiss avatar Jul 29 '24 10:07 talex-zeiss

Would be a nice addition.

Pindar avatar Jul 29 '24 11:07 Pindar

@m-amaresh, @Pindar - we are currently planning for ingress customizations through the upcoming Gateway API + Istio experiences. Currently discussing the plans for this internally. Once we have the final details and clear ETA, will share an update here.

shashankbarsin avatar Aug 08 '24 17:08 shashankbarsin

+1

williamohara avatar Sep 16 '24 16:09 williamohara

@m-amaresh @talex-zeiss @Pindar @williamohara the following annotations can now be added to the Istio ingress K8s service:

  • service.beta.kubernetes.io/azure-load-balancer-internal-subnet: to bind an internal ingress gateway to a specific subnet.
  • service.beta.kubernetes.io/azure-shared-securityrule: for exposing the ingress gateway through an augmented security rule.
  • service.beta.kubernetes.io/azure-allowed-service-tags: for specifying which service tags the ingress gateway can receive requests from.
  • service.beta.kubernetes.io/azure-load-balancer-ipv4: for configuring a static IPv4 address.
  • service.beta.kubernetes.io/azure-load-balancer-resource-group: for specifying the resource group of a public IP in a different resource group from the cluster.
  • service.beta.kubernetes.io/azure-pip-name: for specifying the name of a public IP address.
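As a rough illustration of the annotations above, here is a minimal patch fragment for the internal ingress gateway Service. The file name is hypothetical, the subnet and IP values are the ones used as examples elsewhere in this thread, and the Service name and namespace in the usage note are assumptions to verify on your own cluster.

```yaml
# annotations-patch.yaml (hypothetical file name)
# Merge-patch fragment: only the annotations to add are listed.
metadata:
  annotations:
    # Place the internal load balancer in a specific subnet (example value).
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "internal-subnet"
    # Request a static IPv4 address (example value, must be free in that subnet).
    service.beta.kubernetes.io/azure-load-balancer-ipv4: "10.1.0.4"
```

It could then be applied with something like `kubectl patch service aks-istio-ingressgateway-internal -n aks-istio-ingress --patch-file annotations-patch.yaml`, assuming that is the Service name and namespace the add-on created.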

nshankar13 avatar Sep 23 '24 12:09 nshankar13

@nshankar13 Thanks for your response, I was expecting API support while creating or updating the cluster. API Reference

The managed Istio add-on can be installed while creating the cluster, so customizing the ingress gateway configuration manually after cluster creation does not make sense to me. If API support is added, we can configure the settings while creating or updating the cluster.

For example, using Terraform:

resource "azurerm_kubernetes_cluster" "example" {
  name                = "example-aks1"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  dns_prefix          = "exampleaks1"

  default_node_pool {
    name       = "default"
    node_count = 1
    vm_size    = "Standard_D2_v2"
  }

  identity {
    type = "SystemAssigned"
  }

  service_mesh_profile {
    mode                             = "Istio"
    revisions                        = ["asm-1-20", "asm-1-21"]
    internal_ingress_gateway_enabled = true
    internal_ingress_gateway_subnet  = "some-subnet"
    internal_ingress_gateway_ip      = "10.1.0.4"
    # Other configs as required

    external_ingress_gateway_enabled = true
    external_ingress_gateway_ip      = "existing_pip_name"
    # Other configs as required
  }
}

Once API support is added, this can be done via the Azure Portal or Terraform (the azurerm provider would also be updated to support the new parameters in the API).

m-amaresh avatar Sep 23 '24 14:09 m-amaresh

Hi @nshankar13 - Do the annotations have to be added manually by running kubectl commands against the cluster? Or is there a shared ConfigMap, like the one for MeshConfig, so that once we add them there the ingress pods will get these annotations?

Is it available in asm-1-21 or 1-22 or 1-23?

KarthikDev avatar Sep 25 '24 07:09 KarthikDev

Hello @shashankbarsin, we are currently using the self-managed Istio model and are looking to transition to the managed Istio add-on. Our services rely heavily on the IstioOperator model and override many configurations.

Is there a plan to allow configuring more settings in the managed add-on? Below is the istioOperatorSpec for one of our services as an example.

# Override IstioOperator.yaml file
global:
 istioOperatorSpec:
   spec:
     profile: "default"
     hub: sample.azurecr.io/istio

     meshConfig:
       defaultConfig:

         holdApplicationUntilProxyStarts: true
       trustDomain: samplecluster

     components:
       ingressGateways:
       - name: istio-ingressgateway
         enabled: true
         k8s:
           readinessProbe:
             failureThreshold: 5
             httpGet:
               path: /healthz/ready
               port: 15021
             initialDelaySeconds: 5
             periodSeconds: 2
             successThreshold: 3
             timeoutSeconds: 1
           env:
           - name: ISTIO_META_ROUTER_MODE
             value: "sni-dnat"
           - name: ISTIO_META_IDLE_TIMEOUT
             value: 960s
           service:
             ports:
               - port: 443
                 targetPort: 8443
                 name: https
                 protocol: TCP
           hpaSpec:
             minReplicas: __InfraHelmConfiguration-IstioOperator-IngressGateways-K8sHpa-MinReplicas__
             maxReplicas: __InfraHelmConfiguration-IstioOperator-IngressGateways-K8sHpa-MaxReplicas__
             # NOTE: do not remove the parameters below. these are the defaults for this istio component. these were added because configmerge will 
             # turn hpaSpec into null if the optional parameters aren't set without it. if hpaSpec becomes null, then it blows away all of these
             # values in the parent helm chart.
             scaleTargetRef:
               apiVersion: apps/v1
               kind: Deployment
               name: istio-ingressgateway
             metrics:
               - type: Resource
                 resource:
                   name: cpu
                   target:
                     type: Utilization
                     averageUtilization: 60
           resources:
             requests:
               cpu: 2000m
               memory: 512Mi
             limits:
               # Istio will not allow blank CPU limit, and defaults to 2000m.  Setting explicitly.
               cpu: 8000m
               memory: 1Gi
           strategy:
             rollingUpdate:
               maxSurge: "100%"
               maxUnavailable: "25%"

           overlays:
             - kind: Deployment
               name: istio-ingressgateway
               patches:
                 - path: spec.template.spec.dnsConfig
                   value:
                     options:
                       - name: single-request-reopen
                 - path: spec.template.spec.topologySpreadConstraints
                   value:
                     - maxSkew: 1
                       topologyKey: "kubernetes.io/hostname"
                       whenUnsatisfiable: ScheduleAnyway
                       labelSelector:
                         matchExpressions:
                           - key: app
                             operator: In
                             values:
                             - istio-ingressgateway

       pilot:
         enabled: true
         k8s:
           resources:
             requests:
               cpu: 4000m
               memory: 2Gi
             limits:
               cpu: 4000m
               memory: 2Gi
           hpaSpec:
             minReplicas: __InfraHelmConfiguration-IstioOperator-Pilot-K8sHpa-MinReplicas__
             maxReplicas: __InfraHelmConfiguration-IstioOperator-Pilot-K8sHpa-MaxReplicas__
           overlays:
             - kind: Deployment
               name: istiod
               patches:
                 - path: spec.template.spec.dnsConfig
                   value:
                     options:
                       - name: single-request-reopen
       cni:

     values:
       global:
         multiCluster:
           enabled: true
           clusterName: samplecluster
           globalDomainSuffix: "global"
           includeEnvoyFilter: true
         defaultNodeSelector:
           kubernetes.io/os: linux
         proxy:
           resources:
             requests:
               cpu: "500m"
               memory: 1Gi
             limits:
               # Istio will not allow blank CPU limit, and defaults to 2000m.  Setting explicitly here to be obvious that it is not actually unlimited.
               cpu: "3000m"
               memory: 1Gi
           # See details here https://github.com/istio/istio/issues/26923

       pilot:
         env:
           # pilot is replaced by istiod
           # Istio 1.3 added trust domain validation by default, but this breaks multi-trust domain scenarios, so we disable it
           PILOT_SKIP_VALIDATE_TRUST_DOMAIN: true

       # security:   # security field is deprecated in 1.6
         # By default Istio creates a self signed root certificate authority. If you are creating a multi-cluster mesh
         # (see Section 6), you should set this to false. This will tell Istio to look for your custom Root CA
         # certificate. If your mesh is single-cluster, feel free to omit this flag or set it to true.
         # selfSigned: false

       telemetry:
         # v1: # v1 removed, so setting no longer needed
         #   # disable mixer (telemetry v1) for istio 1.8+
         #   enabled: false
         v2:
           # the default option for istio 1.8+
           enabled: true
       gateways:
         istio-ingressgateway:
           # preserve original client source IP, https://istio.io/latest/docs/tasks/security/authorization/authz-ingress/
           externalTrafficPolicy: Local
           ports:
           # We could enable this port if mtls is enabled on the all exposed ports but upstream load balancer requires an endpoint for health check
           # Istio/Envoy provides this port for default health check
           # - port: 15020
           #   targetPort: 15020
           #   name: status-port
           - port: 443
             targetPort: 32443
             name: https
           # This is to create 2 ingress gateway pods at minimum if not specified in "PlatformConfig.Deployment.IstioChartValues.Gateway.MinIngressReplicas"
           # According to https://github.com/istio/operator, this is a pass-through API to helm values that we should avoid using in future. Validated "autoscaleMin" would be overwritten by hpaSpec -> minReplicas
           autoscaleMin: 2
           podAntiAffinityTermLabelSelector:
           - key: app
             operator: In
             values: istio-ingressgateway
             topologyKey: "kubernetes.io/os"
       cni:
         excludeNamespaces:
         - istio-system
         - kube-system
         logLevel: info

shahriaak avatar Sep 26 '24 08:09 shahriaak

@m-amaresh @KarthikDev we will not be exposing staticIP configuration via the API as the planned and preferred method for configuring IP address will be via Gateway API, and we are currently working on the implementation for that - so we are not planning on investing in the API changes since we are prioritizing Gateway API. Also it would be unfeasible to include all possible annotations in that list (pip-name, load-balancer-resource-group, allowed-service-tags, etc) via the API. So right now the only option is to configure the K8s service spec directly.

We understand that this may not be ideal for IaC - but I think with Gateway API this would be possible if you deploy the K8s Gateway via some IaC / GitOps workflow.

And this is available for all supported Istio add-on revisions: https://learn.microsoft.com/en-us/azure/aks/istio-support-policy#service-mesh-add-on-release-calendar.
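To make the IaC / GitOps idea concrete, here is a sketch of a Kubernetes Gateway manifest that could live in a Git repository. It assumes the add-on's Gateway API support, once available, behaves like upstream Istio, where annotations on the Gateway are copied to the generated Service. The resource name, namespace, and `gatewayClassName` value are assumptions, not confirmed AKS behavior.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: internal-gateway            # hypothetical name
  namespace: aks-istio-ingress      # assumed namespace
  annotations:
    # Upstream Istio copies Gateway annotations to the Service it creates,
    # so the Azure load balancer annotations would be declared here.
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "internal-subnet"
spec:
  gatewayClassName: istio           # upstream class name; the add-on may differ
  listeners:
  - name: http
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: Same                  # only routes in this namespace may attach
```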

nshankar13 avatar Sep 27 '24 15:09 nshankar13

@shahriaak unfortunately we cannot expose all of those options via the API for customization right now. However, some of those options are configurable via MeshConfig and the Telemetry API:

  • https://learn.microsoft.com/en-us/azure/aks/istio-meshconfig
  • https://learn.microsoft.com/en-us/azure/aks/istio-telemetry
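For instance, the `holdApplicationUntilProxyStarts` setting from the spec shared earlier might be expressed through the add-on's shared MeshConfig ConfigMap roughly as follows. This is a sketch: the ConfigMap name assumes revision asm-1-22, and whether a given field is allowed or supported should be checked against the MeshConfig page linked above.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Assumed naming pattern: istio-shared-configmap-<revision>
  name: istio-shared-configmap-asm-1-22
  namespace: aks-istio-system
data:
  mesh: |-
    defaultConfig:
      # Delay application container start until the sidecar proxy is ready.
      holdApplicationUntilProxyStarts: true
```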

We also allow customization of HPA for Istiod and the ingress gateways:

  • https://learn.microsoft.com/en-us/azure/aks/istio-scale#horizontal-pod-autoscaling-customization.
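As a sketch of what that HPA customization could look like, the replica bounds can be written as a small patch fragment. The file name and replica counts are illustrative, and the HPA name in the usage note is an assumption to confirm with `kubectl get hpa -A`.

```yaml
# hpa-patch.yaml (hypothetical file name)
# Merge-patch fragment: overrides only the replica bounds of an existing HPA.
spec:
  minReplicas: 3   # illustrative value
  maxReplicas: 6   # illustrative value
```

It could be applied with something like `kubectl patch hpa aks-istio-ingressgateway-external -n aks-istio-ingress --type merge --patch-file hpa-patch.yaml`, assuming that HPA exists under that name.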

For multi-cluster, we don't expose clusterName, meshId, etc. via the API. However, from my understanding, that only impacts telemetry generation. We are currently discussing the multi-cluster experience we want to provide for Istio on AKS and how this would work for the add-on vs. our upcoming fully managed (hosted control plane) offering. We will share an update once we have a better idea of the support scope for multi-cluster.

nshankar13 avatar Sep 27 '24 15:09 nshankar13

@nshankar13 Thank you for your reply! Could you please point me to the current values for the spec I shared above? We need them to determine whether we can move from self-managed Istio to the Azure managed Istio add-on.

shahriaak avatar Sep 29 '24 09:09 shahriaak

@shahriaak you can get the values by running helm get values --all azure-service-mesh-istio-discovery -n aks-istio-system after enabling the add-on - see the example below. Please let us know which fields you need customization for.

global:
  autoscalingv2API: true
  caAddress: ""
  caName: ""
  certSigners: []
  commonGlobals:
    AADTenantID: <>
    CCPID: <>
    CloudEnvironment: <>
    Versions:
      Kubernetes: 1.29.7
    enableKonnectivity: true
  configCluster: false
  configValidation: true
  defaultPodDisruptionBudget:
    enabled: true
  defaultResources:
    requests:
      cpu: 10m
  externalIstiod: false
  hub: mcr.microsoft.com/oss/istio
  imagePullPolicy: IfNotPresent
  imagePullSecrets: []
  istioNamespace: aks-istio-system
  istiod:
    enableAnalysis: false
  logAsJson: false
  logging:
    level: default:info
  meshID: ""
  meshNetworks: {}
  mountMtlsCerts: false
  multiCluster:
    clusterName: ""
    enabled: false
  network: ""
  omitSidecarInjectorConfigMap: false
  operatorManageWebhooks: false
  pilotCertProvider: istiod
  priorityClassName: system-cluster-critical
  proxy:
    autoInject: enabled
    clusterDomain: cluster.local
    componentLogLevel: misc:error
    enableCoreDump: false
    excludeIPRanges: ""
    excludeInboundPorts: ""
    excludeOutboundPorts: ""
    image: proxyv2
    includeIPRanges: '*'
    includeInboundPorts: '*'
    includeOutboundPorts: ""
    logLevel: warning
    outlierLogPath: ""
    privileged: false
    readinessFailureThreshold: 4
    readinessInitialDelaySeconds: 0
    readinessPeriodSeconds: 15
    resources:
      limits:
        cpu: 2000m
        memory: 1024Mi
      requests:
        cpu: 100m
        memory: 128Mi
    startupProbe:
      enabled: true
      failureThreshold: 600
    statusPort: 15020
    tracer: zipkin
  proxy_init:
    image: proxyv2
  remotePilotAddress: ""
  sds:
    token:
      aud: istio-ca
  sts:
    servicePort: 0
  tag: latest
  variant: distroless
highestRevision: asm-1-22
istio_cni:
  chained: true
  provider: default
istiodRemote:
  injectionCABundle: ""
  injectionPath: /inject
  injectionURL: ""
lastReconciliation: "2024-10-02T00:59:44Z"
meshConfig:
  defaultConfig:
    gatewayTopology:
      numTrustedProxies: 1
  enablePrometheusMerge: true
  rootNamespace: aks-istio-system
ownerName: ""
pilot:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: kubernetes.azure.com/mode
            operator: In
            values:
            - system
        weight: 100
      - preference:
          matchExpressions:
          - key: azureservicemesh/istio.replica.preferred
            operator: In
            values:
            - "true"
        weight: 50
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
          - key: type
            operator: NotIn
            values:
            - virtual-kubelet
          - key: kubernetes.azure.com/cluster
            operator: Exists
  autoscaleBehavior: {}
  autoscaleEnabled: true
  autoscaleMax: 5
  autoscaleMin: 2
  cni:
    enabled: false
    provider: default
  configMap: true
  cpu:
    targetAverageUtilization: 80
  deploymentLabels:
    kubernetes.azure.com/managedby: aks
  env:
    ENABLE_NATIVE_SIDECARS: "true"
    PILOT_ENABLE_GATEWAY_API_DEPLOYMENT_CONTROLLER: "false"
  extraContainerArgs: []
  hub: mcr.microsoft.com/oss/istio
  image: pilot
  ipFamilies: []
  ipFamilyPolicy: ""
  jwksResolverExtraRootCA: ""
  keepaliveMaxServerConnectionAge: 30m
  memory: {}
  nodeSelector: {}
  podAnnotations: {}
  podLabels:
    kubernetes.azure.com/managedby: aks
  replicaCount: 2
  resources:
    requests:
      cpu: 500m
      memory: 2048Mi
  rollingMaxSurge: 100%
  rollingMaxUnavailable: 25%
  seccompProfile: {}
  serviceAccountAnnotations: {}
  serviceAnnotations: {}
  tag: ""
  taint:
    enabled: false
    namespace: ""
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  topologySpreadConstraints: []
  traceSampling: 1
  trustedZtunnelNamespace: ""
  variant: distroless
  volumeMounts: []
  volumes: []
releaseManifests:
- imageTag: 1.22.3
  revision: asm-1-22
revision: ""
revisionTags: []
revisions:
- asm-1-22
sidecarInjectorWebhook:
  alwaysInjectSelector: []
  defaultTemplates: []
  enableNamespacesByDefault: false
  injectedAnnotations: {}
  neverInjectSelector: []
  reinvocationPolicy: Never
  rewriteAppHTTPProbe: true
  templates: {}
telemetry:
  enabled: true
  v2:
    enabled: true
    prometheus:
      enabled: true
    stackdriver:
      enabled: false

nshankar13 avatar Oct 02 '24 01:10 nshankar13

@shahriaak please open a new Issue for the values customization

miguelmq avatar Oct 03 '24 17:10 miguelmq