EHPA never gets any prediction data

Open slitobo opened this issue 2 years ago • 2 comments

Kubernetes version: v1.18.4
Crane version: 0.10
Problem: the EHPA never gets any prediction data, and its status shows "not all metric predicted".

TSP status:

Spec:
  Prediction Metrics:
    Algorithm:
      Algorithm Type:  dsp
      Dsp:
        Estimators:
        History Length:   3d
        Sample Interval:  60s
    Expression Query:
      Expression:             sum(irate(container_cpu_usage_seconds_total{namespace="default",pod=~"^debug-app-[a-z0-9]+-[a-z0-9]{5}$",container!=""}[3m]))
    Resource Identifier:      resource.cpu
    Type:                     ExpressionQuery
  Prediction Window Seconds:  3600
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         debug-app
    Namespace:    default
Status:
  Conditions:
    Last Transition Time:  2023-04-03T03:14:33Z
    Message:               not all metric predicted
    Reason:                PredictPartial
    Status:                False
    Type:                  Ready
  Prediction Metrics:
    Ready:                false
    Resource Identifier:  resource.cpu
Events:                   <none>
# kubectl get apiservice | grep 'metrics'
v1beta1.custom.metrics.k8s.io               kube-system/hpa-metrics-service   True        571d
v1beta1.external.metrics.k8s.io             crane-system/metric-adapter       True        13d

# kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "crane_autoscaling_cron",
      "singularName": "",
      "namespaced": true,
      "kind": "ExternalMetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "crane_autoscaling_prediction",
      "singularName": "",
      "namespaced": true,
      "kind": "ExternalMetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}


# kubectl get ehpa
NAME        STRATEGY   MINPODS   MAXPODS   SPECIFICPODS   REPLICAS   AGE
debug-app   Preview    1         10                       1          5d13h
# kubectl describe ehpa  debug-app
Name:         debug-app
Namespace:    default
Labels:       <none>
Annotations:  autoscaling.crane.io/effective-hpa-current-metrics:
                - resource:
                    current:
                      averageUtilization: 1
                      averageValue: "0"
                    name: cpu
                  type: Resource
API Version:  autoscaling.crane.io/v1alpha1
Kind:         EffectiveHorizontalPodAutoscaler
Metadata:
  Creation Timestamp:  2023-03-28T13:34:12Z
  Generation:          5
  ......
Spec:
  Max Replicas:  10
  Metrics:
    Resource:
      Name:  cpu
      Target:
        Average Utilization:  50
        Type:                 Utilization
    Type:                     Resource
  Min Replicas:               1
  Prediction:
    Prediction Algorithm:
      Algorithm Type:  dsp
      Dsp:
        Estimators:
        History Length:         3d
        Sample Interval:        60s
    Prediction Window Seconds:  3600
  Scale Strategy:               Preview
  Scale Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         debug-app
Status:
  Conditions:
    Last Transition Time:  2023-04-03T03:13:13Z
    Message:               Effective HPA is ready
    Reason:                EffectiveHorizontalPodAutoscalerReady
    Status:                True
    Type:                  Ready
    Last Transition Time:  2023-04-03T03:13:13Z
    Message:               not all metric predicted
    Reason:                PredictPartial
    Status:                False
    Type:                  PredictionReady
    Last Transition Time:  2023-04-03T03:13:13Z
    Message:               the HPA controller was able to update the target scale to 1
    Reason:                SucceededRescale
    Status:                True
    Type:                  AbleToScale
    Last Transition Time:  2023-04-03T03:13:13Z
    Message:               the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
    Reason:                ValidMetricFound
    Status:                True
    Type:                  ScalingActive
    Last Transition Time:  2023-04-03T03:13:13Z
    Message:               the desired count is within the acceptable range
    Reason:                DesiredWithinRange
    Status:                False
    Type:                  ScalingLimited
  Current Replicas:        9
  Expect Replicas:         1
Events:                    <none>

Monitoring shows that the metric does have data: (image)

The pod has a load peak every half hour: (image)
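One way to double-check that the expression really has a full history window behind it is to run the same PromQL as a range query over 3 days at a 60s step and count the returned points. The sketch below only builds the parameters for Prometheus's `/api/v1/query_range` HTTP endpoint and computes how many points a gap-free series would return. The helper names and the fixed end time are hypothetical, and this is not how Crane itself fetches history.

```python
from datetime import datetime, timedelta, timezone

# The expression from the TSP spec above.
PROMQL = (
    'sum(irate(container_cpu_usage_seconds_total{namespace="default",'
    'pod=~"^debug-app-[a-z0-9]+-[a-z0-9]{5}$",container!=""}[3m]))'
)


def range_query_params(query, end, history=timedelta(days=3), step_seconds=60):
    """Build query parameters for GET /api/v1/query_range (hypothetical helper)."""
    start = end - history
    return {
        "query": query,
        "start": start.timestamp(),  # Unix timestamps are accepted by the API
        "end": end.timestamp(),
        "step": f"{step_seconds}s",
    }


def expected_samples(history=timedelta(days=3), step_seconds=60):
    """Number of points a gap-free series returns over the window, endpoints included."""
    return int(history.total_seconds()) // step_seconds + 1


# Fixed end time, chosen only to keep the example deterministic.
end = datetime(2023, 4, 3, 5, 0, tzinfo=timezone.utc)
params = range_query_params(PROMQL, end)
print(params["step"], expected_samples())
```

If the range query returns far fewer than 4321 points for this window, the series has gaps or does not yet cover the full 3d history length.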

slitobo • Apr 03 '23 05:04

Pulse-shaped traffic load may not be predictable. Each simulated load burst needs to last longer.
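As a rough way to think about what "longer" could mean, the sketch below counts how many samples one burst contributes at the TSP's 60s sample interval, and what fraction of the half-hour cycle the burst covers. This is back-of-the-envelope arithmetic only. The function names and burst lengths are hypothetical, and nothing here reflects how Crane's DSP algorithm decides whether a series is predictable.

```python
def samples_per_burst(burst_seconds, sample_interval_seconds=60):
    """Count the whole sample intervals that fit inside one load burst."""
    if burst_seconds < 0 or sample_interval_seconds <= 0:
        raise ValueError("burst must be >= 0 and sample interval must be > 0")
    return burst_seconds // sample_interval_seconds


def duty_cycle(burst_seconds, period_seconds=1800):
    """Fraction of each cycle spent under load (1800s = the half-hour peak)."""
    if period_seconds <= 0:
        raise ValueError("period must be > 0")
    return burst_seconds / period_seconds


# Hypothetical burst lengths, from a short spike to a sustained load.
for burst in (30, 300, 900):
    print(burst, samples_per_burst(burst), round(duty_cycle(burst), 3))
```

A 30s spike fits no whole 60s interval, while a 900s burst spans 15 of them and covers half of each cycle.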

qmhu • Apr 04 '23 02:04

OK, thanks.

slitobo • Apr 04 '23 03:04