
conditional NodeResourceAnalyzer

Open manavellamnimble opened this issue 5 years ago • 2 comments

@markpundsack @divolgin The idea is to make NodeResourceAnalyzer more flexible when working with allocatable resources. As stated in issue #210, an update does not need the same allocatable resources as a new install. I added the ability to check whether a given deployment exists in a namespace, and to specify separate resource requirements for new installs and for updates. For this purpose I created three new fields: deployment, which contains the name and namespace of the deployment; onInstall, which provides the filters and outcomes for new installs; and onUpgrade, which may provide different filters and outcomes for the case where the deployment already exists.

This does not change the way the analyzer originally functioned. An example would be as follows:

- nodeResources:
        checkName: check allocatable resources for new installs or updates
        deployment:
          name: myapp
          namespace: default
        onInstall:
          filters:
            cpuAllocatable: "5"
            memoryAllocatable: 5Gi
          outcomes:
            - fail:
                when: "count() < 1"
                message: On new installs, this application requires at least 1 node with 5 allocatable cpus and 5Gb of allocatable memory.
                uri: https://kurl.sh/docs/install-with-kurl/adding-nodes
            - warn:
                when: "count() < 2"
                message: On new installs, this application requires at least 2 nodes with 5 allocatable cpus and 5Gb of allocatable memory.
                uri: https://kurl.sh/docs/install-with-kurl/adding-nodes
            - pass:
                message: This cluster has enough nodes
        onUpdate:
          filters:
            cpuAllocatable: "2"
            memoryAllocatable: 2Gi
          outcomes:
            - fail:
                when: "count() < 1"
                message: On updates, this application requires at least 1 node with 2 allocatable cpus and 2Gb of allocatable memory.
                uri: https://kurl.sh/docs/install-with-kurl/adding-nodes
            - warn:
                when: "count() < 2"
                message: This application recommends at least 2 nodes with 2 allocatable cpus and 2Gb of allocatable memory.
                uri: https://kurl.sh/docs/install-with-kurl/adding-nodes
            - pass:
                message: This cluster has enough nodes to update.
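
As a rough illustration of the selection logic in the example above (a sketch in Python, not the actual Troubleshoot code), the analyzer would pick one set of thresholds depending on whether the deployment exists and then apply the outcomes in order. This assumes the filters act as minimums on allocatable CPU and memory; `Node`, `count_matching` and `evaluate` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

GI = 1024 ** 3  # bytes in one Gi


@dataclass
class Node:
    name: str
    cpu_allocatable: int      # whole cores, simplified for the sketch
    memory_allocatable: int   # bytes


def count_matching(nodes, min_cpu, min_memory):
    """Count nodes whose allocatable CPU and memory both meet the minimums."""
    return sum(
        1
        for n in nodes
        if n.cpu_allocatable >= min_cpu and n.memory_allocatable >= min_memory
    )


def evaluate(nodes, deployment_exists):
    """Choose the update or install thresholds, then apply fail/warn/pass in order."""
    if deployment_exists:
        min_cpu, min_memory = 2, 2 * GI   # filters from the update branch of the example
    else:
        min_cpu, min_memory = 5, 5 * GI   # filters from the onInstall branch
    count = count_matching(nodes, min_cpu, min_memory)
    if count < 1:
        return "fail"
    if count < 2:
        return "warn"
    return "pass"
```

For a cluster with one 4-core/4Gi node and one 8-core/8Gi node, this sketch would warn on a new install (only one node meets the 5/5Gi minimums) and pass on an update.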

Fix #210

manavellamnimble avatar Oct 29 '20 19:10 manavellamnimble

@markpundsack @manavellamnimble I agree with the problem identified here, but I'm not sure I agree with the solution.

Troubleshoot doesn't know about "install" vs "upgrade" workflows, that's a KOTS detail. How would this be used in an OSS workflow, outside of KOTS?

I think we need to go back to the design here a little. Label selectors feel like the right approach, but "onInstall" and "onUpgrade" feel like they wrap up some magic.

The goal here should be to somehow collect node resources without the podspecs that match the label selector (or only the ones that do).

I think the more "k8s native" design is to have a match label selector in the analyzer spec. We should collect the node resources specified, but subtract out the podspecs that match the matchlabel selector to determine the availability on the node without the pods that match.
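
One way to read this proposal, sketched in Python under stated assumptions: treat a node's availability as its allocatable CPU minus the requests of every pod that does not match the selector, so that the matching pods are left out of the accounting. The pod dictionaries, the `cpu_request` key (in millicores) and the function names are hypothetical and not part of Troubleshoot.

```python
def matches(labels, selector):
    """True when every key/value pair in the selector appears in the pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def available_without(allocatable_millis, pods, selector):
    """CPU (millicores) left on a node when pods matching the selector are ignored."""
    used_by_others = sum(
        pod["cpu_request"]
        for pod in pods
        if not matches(pod["labels"], selector)
    )
    return allocatable_millis - used_by_others
```

With an 8000m node running a 2000m pod labeled app=myapp and a 1000m pod labeled app=other, the selector app=myapp would leave 7000m available, because only the other pod's request is counted.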

I'm not sure what I'm proposing is exactly right. But I'd like to have more discussion before this change is merged in.

marccampbell avatar Oct 29 '20 23:10 marccampbell

@marccampbell I thought of something similar at first, using the spec.containers[].resources.requests[] field of the pods matching a certain label, but I wasn't sure whether all vendors reserve resources in this fashion. If you are ok with it, I will try this approach.
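
To make the concern concrete, here is a minimal Python sketch of summing spec.containers[].resources.requests for one pod. It assumes the pod is a plain dictionary shaped like the Kubernetes pod JSON, and the quantity parsing is deliberately simplified (only the "m" CPU suffix and the Ki/Mi/Gi memory suffixes; the real quantity format allows more). The function names are hypothetical. A container that declares no requests contributes zero, which is exactly the case where this approach would undercount.

```python
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_cpu_millis(quantity):
    """Convert a CPU quantity such as "500m" or "2" to millicores (simplified)."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def parse_memory_bytes(quantity):
    """Convert a memory quantity such as "512Mi" or "2Gi" to bytes (simplified)."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def pod_requests(pod):
    """Sum CPU and memory requests over a pod's containers; missing requests count as zero."""
    cpu_millis = 0
    memory_bytes = 0
    for container in pod.get("spec", {}).get("containers", []):
        requests = container.get("resources", {}).get("requests", {})
        cpu_millis += parse_cpu_millis(requests.get("cpu", "0"))
        memory_bytes += parse_memory_bytes(requests.get("memory", "0"))
    return cpu_millis, memory_bytes
```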

manavellamnimble avatar Oct 30 '20 19:10 manavellamnimble