troubleshoot icon indicating copy to clipboard operation
troubleshoot copied to clipboard

Validation of input specs before running troubleshoot operations

Open banjoh opened this issue 3 years ago • 0 comments

Describe the rationale for the suggested feature.

Input specs (often YAML) passed in to troubleshoot CLIs & library functions can often be erroneous. We need to provide a way to validate this input before running any operations. Any validation errors need to be clear and user friendly. At the moment, input validation is handled by collector/analyser/redactor implementation at a very basic level.

Yaml validation is not in itself enough. Below is an example of a case that yaml validation won't catch.

Wrong indentations (Valid YAML, but invalid troubleshoot spec)

apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: subnet-available
spec:
  collectors:
    # would output yes/no depending if there is a /22 available in 10.0.0.0/8
    - subnetAvailable:
      CIDRRangeAlloc: "10.0.0.0/8"
      desiredCIDR: 22
  analyzers:
    - subnetAvailable:
      outcomes:
        - fail:
          when: "no-subnet-available"
          message: failed to find available subnet
        - pass:
          when: "a-subnet-is-available"
          message: available /22 subnet found

... collector json object equivalent will be

{
    "subnetAvailable": null,
    "CIDRRangeAlloc": "10.0.0.0/8",
    "desiredCIDR": 22
}

Correct indentation

apiVersion: troubleshoot.sh/v1beta2
kind: HostPreflight
metadata:
  name: subnet-available
spec:
  collectors:
    # would output yes/no depending if there is a /22 available in 10.0.0.0/8
    - subnetAvailable:
        CIDRRangeAlloc: "10.0.0.0/0"
        desiredCIDR: 22
  analyzers:
    - subnetAvailable:
        outcomes:
          - fail:
              when: "no-subnet-available"
              message: failed to find available subnet
          - pass:
              when: "a-subnet-is-available"
              message: available /22 subnet found

... collector json object equivalent will be

{
    "subnetAvailable": {
        "CIDRRangeAlloc": "10.0.0.0/0",
        "desiredCIDR": 22
    }
}

Describe the feature

Below are two approaches this feature could take

  • Annotate structs with validation tags and ran inputs through a validation library such as this one to ensure they do not violate the required specification.
  • Utilise kubebuilder markers and validate input specs against troubleshoot generated CRD schemas. We need to
    • https://book.kubebuilder.io/reference/generating-crd.html#validation
    • https://book.kubebuilder.io/reference/markers/crd-validation.html
    • https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation
  • https://pkg.go.dev/k8s.io/kubectl/pkg/validation - the kubectl way
  • Create API enablers to allow projects using troubleshoot as a library to enable/disable validation, consume validation errors in code etc
  • Add capability to provide custom implementation to validate fields that cannot be annotated in OpenAPI syntax e.g validating regular expressions, CIDR blocks etc

Some use cases to validate This list can be very long but some few cases that we have noticed are

  • Checking required fields
  • Defining only one value of multiple possibilities e.g with the http collector we should only provide one verb in the spec
  • Minimums/maximums e.g no negative timeouts, maximum size of pod logs (maybe?)...
  • Validating CIDR blocks, regular expressions...
  • Ensure analysers always have a default outcome i.e one and only one outcome with no when condition. It must also be the last one (https://github.com/replicatedhq/troubleshoot/issues/1379#issuecomment-1836483682)
  • ...

banjoh avatar Nov 23 '22 18:11 banjoh