Invalid replica state for HPA

Open ricardo-oruspay opened this issue 3 years ago • 6 comments

A newly created Integration does not set the initial `.spec.replicas` value that the HPA requires.

Simple deploy:

kamel run -n prod-camelk gateway.groovy                                                                                     
Integration "gateway" created

Integration OK:

k describe it gateway -n prod-camelk
Name:         gateway
Namespace:    prod-camelk
Labels:       <none>
Annotations:  <none>
API Version:  camel.apache.org/v1
Kind:         Integration
Metadata:
  Creation Timestamp:  2022-03-23T11:28:17Z
  Generation:          1
  Managed Fields:
    API Version:  camel.apache.org/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:dependencies:
        f:sources:
        f:traits:
          .:
          f:container:
            .:
            f:configuration:
              .:
              f:requestCPU:
          f:mount:
            .:
            f:configuration:
              .:
              f:configs:
      f:status:
        .:
        f:capabilities:
        f:conditions:
        f:dependencies:
        f:digest:
        f:image:
        f:integrationKit:
          .:
          f:name:
          f:namespace:
        f:lastInitTimestamp:
        f:phase:
        f:platform:
        f:profile:
        f:replicas:
        f:runtimeProvider:
        f:runtimeVersion:
        f:selector:
        f:version:
    Manager:         kamel
    Operation:       Update
    Time:            2022-03-23T11:28:17Z
  Resource Version:  320716511
  Self Link:         /apis/camel.apache.org/v1/namespaces/prod-camelk/integrations/gateway
  UID:               503de9bd-7304-4338-8f35-3681ccb5f8b5
Spec:
  Dependencies:
    camel:jackson
    camel:http
  Sources:
    Content:  ...
    Name:  gateway.groovy
  Traits:
    Container:
      Configuration:
        Request CPU:  50ms
    Mount:
      Configuration:
        Configs:
          configmap:gateway-config
Status:
  Capabilities:
    rest
  Conditions:
    First Truthy Time:     2022-03-23T11:28:17Z
    Last Transition Time:  2022-03-23T11:28:17Z
    Last Update Time:      2022-03-23T11:28:17Z
    Message:               prod-camelk/camel-k
    Reason:                IntegrationPlatformAvailable
    Status:                True
    Type:                  IntegrationPlatformAvailable
    First Truthy Time:     2022-03-23T11:28:17Z
    Last Transition Time:  2022-03-23T11:28:17Z
    Last Update Time:      2022-03-23T11:28:17Z
    Message:               kit-c8su28hts5e0i2othlgg
    Reason:                IntegrationKitAvailable
    Status:                True
    Type:                  IntegrationKitAvailable
    Last Transition Time:  2022-03-23T11:28:17Z
    Last Update Time:      2022-03-23T11:28:17Z
    Message:               different controller strategy used (deployment)
    Reason:                CronJobNotAvailableReason
    Status:                False
    Type:                  CronJobAvailable
    First Truthy Time:     2022-03-23T11:28:17Z
    Last Transition Time:  2022-03-23T11:28:17Z
    Last Update Time:      2022-03-23T11:28:17Z
    Message:               deployment name is gateway
    Reason:                DeploymentAvailable
    Status:                True
    Type:                  DeploymentAvailable
    First Truthy Time:     2022-03-23T11:28:17Z
    Last Transition Time:  2022-03-23T11:28:17Z
    Last Update Time:      2022-03-23T11:28:17Z
    Message:               gateway(http/80) -> integration(http/8080)
    Reason:                ServiceAvailable
    Status:                True
    Type:                  ServiceAvailable
    Last Transition Time:  2022-03-23T11:28:17Z
    Last Update Time:      2022-03-23T11:28:17Z
    Message:               no host or service defined
    Reason:                IngressNotAvailable
    Status:                False
    Type:                  ExposureAvailable
    First Truthy Time:     2022-03-23T11:28:19Z
    Last Transition Time:  2022-03-23T11:28:19Z
    Last Update Time:      2022-03-23T11:28:19Z
    Message:               1/1 ready replicas
    Reason:                DeploymentReady
    Status:                True
    Type:                  Ready
  Dependencies:
    camel:http
    camel:jackson
    mvn:org.apache.camel.k:camel-k-runtime
    mvn:org.apache.camel.quarkus:camel-quarkus-groovy-dsl
    mvn:org.apache.camel.quarkus:camel-quarkus-platform-http
    mvn:org.apache.camel.quarkus:camel-quarkus-rest
  Digest:  vQLk6v_KmBpskZftDXC4pi4OMyy7vJPBoylWqmY_2u9E
  Image:   .../camel-k-kit-c8su28hts5e0i2othlgg@sha256:bc22a992c7c9345bb7be83feae553d4c18805498cfc3915a8918a81032b662a7
  Integration Kit:
    Name:               kit-c8su28hts5e0i2othlgg
    Namespace:          prod-camelk
  Last Init Timestamp:  2022-03-23T11:28:17Z
  Phase:                Running
  Platform:             camel-k
  Profile:              Kubernetes
  Replicas:             1
  Runtime Provider:     quarkus
  Runtime Version:      1.12.0
  Selector:             camel.apache.org/integration=gateway
  Version:              1.8.2
Events:
  Type    Reason                       Age                    From                            Message
  ----    ------                       ----                   ----                            -------
  Normal  IntegrationConditionChanged  2m39s                  camel-k-integration-controller  Condition "IntegrationPlatformAvailable" is "True" for Integration gateway: prod-camelk/camel-k
  Normal  IntegrationPhaseUpdated      2m39s                  camel-k-integration-controller  Integration "gateway" in phase "Initialization"
  Normal  IntegrationPhaseUpdated      2m39s                  camel-k-integration-controller  Integration "gateway" in phase "Building Kit"
  Normal  IntegrationConditionChanged  2m39s                  camel-k-integration-controller  Condition "IntegrationKitAvailable" is "True" for Integration gateway: kit-c8su28hts5e0i2othlgg
  Normal  IntegrationPhaseUpdated      2m39s                  camel-k-integration-controller  Integration "gateway" in phase "Deploying"
  Normal  IntegrationConditionChanged  2m39s (x2 over 2m39s)  camel-k-integration-controller  Condition "CronJobAvailable" is "False" for Integration gateway: different controller strategy used (deployment)
  Normal  IntegrationConditionChanged  2m39s (x2 over 2m39s)  camel-k-integration-controller  Condition "DeploymentAvailable" is "True" for Integration gateway: deployment name is gateway
  Normal  IntegrationConditionChanged  2m39s (x2 over 2m39s)  camel-k-integration-controller  Condition "ServiceAvailable" is "True" for Integration gateway: gateway(http/80) -> integration(http/8080)
  Normal  IntegrationConditionChanged  2m39s (x2 over 2m39s)  camel-k-integration-controller  Condition "ExposureAvailable" is "False" for Integration gateway: no host or service defined
  Normal  IntegrationConditionChanged  2m39s (x2 over 2m39s)  camel-k-integration-controller  Condition "Ready" is "False" for Integration gateway: 0/1 updated replicas
  Normal  IntegrationPhaseUpdated      2m39s (x2 over 2m39s)  camel-k-integration-controller  Integration "gateway" in phase "Running"
  Normal  IntegrationConditionChanged  2m38s (x2 over 2m38s)  camel-k-integration-controller  Condition "Ready" is "False" for Integration gateway: 0/1 ready replicas
  Normal  IntegrationConditionChanged  2m37s (x2 over 2m37s)  camel-k-integration-controller  Condition "Ready" is "True" for Integration gateway: 1/1 ready replicas

HPA spec:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: gateway-autoscale
  namespace: prod-camelk
spec:
  scaleTargetRef:
    apiVersion: camel.apache.org/v1
    kind: Integration
    name: gateway
  minReplicas: 3
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80

The HPA cannot find the replicas field:


k describe hpa gateway-autoscale -n prod-camelk
Name:                                                  gateway-autoscale
Namespace:                                             prod-camelk
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Wed, 23 Mar 2022 08:37:22 -0300
Reference:                                             Integration/gateway
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 80%
Min replicas:                                          3
Max replicas:                                          3
Integration pods:                                      0 current / 0 desired
Conditions:
  Type         Status  Reason          Message
  ----         ------  ------          -------
  AbleToScale  False   FailedGetScale  the HPA controller was unable to get the target's current scale: Internal error occurred: the spec replicas field ".spec.replicas" does not exist
Events:
  Type     Reason          Age   From                       Message
  ----     ------          ----  ----                       -------
  Warning  FailedGetScale  13s   horizontal-pod-autoscaler  Internal error occurred: the spec replicas field ".spec.replicas" does not exist

The workaround is to force a scale operation, with any replica count:

kubectl scale it gateway -n prod-camelk --replicas 1                                                                        
integration.camel.apache.org/gateway scaled

Then the HPA starts working:

k describe hpa gateway-autoscale -n prod-camelk                                                                             
Name:                                                  gateway-autoscale
Namespace:                                             prod-camelk
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Wed, 23 Mar 2022 09:03:54 -0300
Reference:                                             Integration/gateway
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 80%
Min replicas:                                          3
Max replicas:                                          3
Integration pods:                                      3 current / 3 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
Events:
  Type     Reason                        Age               From                       Message
  ----     ------                        ----              ----                       -------
  Normal   SuccessfulRescale             32s               horizontal-pod-autoscaler  New size: 3; reason: Current number of replicas below Spec.MinReplicas

Version: Camel K 1.8.2

ricardo-oruspay avatar Mar 23 '22 12:03 ricardo-oruspay

I had a look at this, and the most immediate solution I can think of is to default .spec.replicas to 1 when creating an Integration. However, I prefer keeping an empty value, as this is interpreted and translated into .status.replicas correctly. I think the problem is more on the HPA side, which should either evaluate a missing replica value as 1 or, better, look at .status.replicas.

squakez avatar Aug 10 '22 10:08 squakez

I've checked with the K8S team and it seems to be an HPA API requirement. @astefanutti @tadayosi would you see any harm in defaulting .spec.replicas to 1 if it was not specified? I am not sure it is good practice to alter any value of the spec on behalf of the user.

squakez avatar Sep 02 '22 08:09 squakez

Yes it seems KEDA also requires the .spec.replicas to be set. Some context has been shared in https://github.com/apache/camel-k/pull/2838#discussion_r783859878.

I agree, the .spec block should not be updated by the operator. One approach could be to use defaulting in the CRD. But that'd require some adjustments in the Knative trait, so it plays well with Knative autoscaling.

In the short term, before a long term solution is found, we could also advocate to document that for HPA to work, users have to set the replicas explicitly on the integrations. I'd rather not patch Camel K here and there to accommodate how each autoscaler interprets the scale sub-resource specification and derive their own requirements.
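As a minimal sketch of what "setting the replicas explicitly" could look like, the fragment below adds `replicas` to the spec of the `gateway` Integration from this issue. The value 1 is only an example; the rest of the spec (sources, dependencies, traits) is assumed to stay as it already is and is omitted here.

```yaml
# Fragment of the Integration from this issue, not a complete manifest.
apiVersion: camel.apache.org/v1
kind: Integration
metadata:
  name: gateway
  namespace: prod-camelk
spec:
  # Set explicitly so that the scale sub-resource can resolve .spec.replicas,
  # which is the field the HPA error message reports as missing.
  replicas: 1
  # sources, dependencies and traits omitted (unchanged)
```

This should have the same effect as the `kubectl scale` workaround shown in the issue description, which also ends up writing `.spec.replicas`.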

astefanutti avatar Sep 02 '22 09:09 astefanutti

Agreed. Let's turn this into a documentation request then. Thanks for the feedback!

squakez avatar Sep 02 '22 09:09 squakez

For many cases defaulting to .spec.replicas=1 seems to make sense. What if we set up .spec.replicas on an integration only when Knative profile is not chosen, or Knative Service trait is not applied?

tadayosi avatar Sep 05 '22 06:09 tadayosi

> For many cases defaulting to .spec.replicas=1 seems to make sense. What if we set up .spec.replicas on an integration only when Knative profile is not chosen, or Knative Service trait is not applied?

The defaulting proposed above is at the CRD level, as in https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#defaulting. It has the advantage that it avoids the operator touching the spec block which should be owned by users, but has the "disadvantage" that it's static. I'm quoting disadvantage, because I could almost argue it's an advantage, i.e., having a sensible, consistent default, that's not dynamic because it must accommodate interpretation / implementation of auto-scalers.
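For illustration only, CRD-level defaulting of this kind might look roughly like the excerpt below. This is a hypothetical fragment, not the actual Camel K CRD: the `default: 1` value is the one discussed in this thread, `specReplicasPath` is taken from the HPA error message, and the other two scale paths are assumptions based on the `.status.replicas` and `.status.selector` fields visible in the `describe` output.

```yaml
# Hypothetical excerpt of a CustomResourceDefinition (apiextensions.k8s.io/v1).
# Only the parts relevant to defaulting and the scale sub-resource are shown.
versions:
  - name: v1
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
                format: int32
                # Static default applied by the API server when the field is omitted
                default: 1
    subresources:
      scale:
        specReplicasPath: .spec.replicas      # path named in the HPA error
        statusReplicasPath: .status.replicas  # assumed
        labelSelectorPath: .status.selector   # assumed
```

Because the API server applies the default when the object is persisted, the operator never has to write to the spec block, which is the property argued for above.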

astefanutti avatar Sep 05 '22 07:09 astefanutti

This issue has been automatically marked as stale due to 90 days of inactivity. It will be closed if no further activity occurs within 15 days. If you think that’s incorrect or the issue should never stale, please simply write any comment. Thanks for your contributions!

github-actions[bot] avatar Dec 05 '22 00:12 github-actions[bot]