autoscaler icon indicating copy to clipboard operation
autoscaler copied to clipboard

Setting custom threshold computes bad range if use string value

Open alexlo03 opened this issue 1 year ago • 2 comments

Background

I have the autoscaler (v2.0.1) working in nonproduction. For Production I want to over-provision to start with, therefore I want to set a CPU threshold lower than the recommended threshold of 65% for a regional instance. I want to start at 40% (or maybe lower) and work up from there slowly.

Bug

Just setting a threshold on the existing metrics breaks in a bizarre way if you specify values as strings.

tf-poller-function high_priority_cpu = 0.583039742903243, threshold = 40, margin = 5, location = us-central1

tf-scaler-function high_priority_cpu=0.583039742903243%, BELOW the range [35%-100%] => suggesting to scale from 300 to 100 PROCESSING_UNITS.

Original Poller Configuration
[
  {
    "instanceId": "redacted",
    "maxSize": "1000",
    "minSize": "100",
    "projectId": "redacted",
    "scalerPubSubTopic": "projects/redacted/topics/scaler-topic",
    "scalingMethod": "LINEAR",
    "stateDatabase": {
      "databaseId": "spanner-autoscaler-state",
      "instanceId": "spanner-autoscaler-state",
      "name": "spanner"
    },
    "units": "PROCESSING_UNITS"
  }
]
Updated Poller Configuration
[
  {
    "instanceId": "redacted",
    "maxSize": "1000",
    "metrics": [
      {
        "name": "high_priority_cpu",
        "regional_threshold": "40"
      },
      {
        "name": "rolling_24_hr",
        "regional_threshold": "50"
      },
      {
        "name": "storage"
      }
    ],
    "minSize": "100",
    "projectId": "redacted",
    "scalerPubSubTopic": "projects/redacted/topics/scaler-topic",
    "scalingMethod": "LINEAR",
    "stateDatabase": {
      "databaseId": "spanner-autoscaler-state",
      "instanceId": "spanner-autoscaler-state",
      "name": "spanner"
    },
    "units": "PROCESSING_UNITS"
  }
]

JS math goes wat if you specify any strings:

Screenshot 2024-06-18 at 16 34 26

I have a patch to fix this I will make the PR tomorrow but submitting this issue now.

alexlo03 avatar Jun 18 '24 19:06 alexlo03

Looking into this more, there's two options:

A. Hack solution: convert strings to numbers in getRange() B. Better solution: parse/validate on JSON load

A is terrible but B is bigger than I can do right now.

alexlo03 avatar Jun 19 '24 13:06 alexlo03

pr #338 adds schema validation of the config file which will check for things like this (and for mis-spelled properties etc) along with a command line validator.

This will be in v3 as it is potentially a breaking change for anyone who has specified numeric params as strings, or incorrectly spelled properties, as these will now be rejected as invalid configurations.

nielm avatar Jul 03 '24 15:07 nielm

I have tested unreleased V3 and this solves the problem. Thanks @nielm

alexlo03 avatar Sep 05 '24 17:09 alexlo03