troubleshoot icon indicating copy to clipboard operation
troubleshoot copied to clipboard

Deduplication for Cluster Resources Collector

Open diamonwiggins opened this issue 3 years ago • 2 comments

Problem to solve

With the recently added feature to concatenate multiple specs together at runtime, we need to ensure that we are not unnecessarily running the Cluster Resources collectors more than it should be. We need to implement deduplication for this collector so that if it's specified more than once with the same options, that it only runs once.

User Impact

  • Troubleshoot may take longer to run due to a redundant operation

Potential solution

Deduplicate any instances of the Cluster Resources collector in the final concatenated spec before running.

Example A:

apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: sample
spec:
  collectors:
    - clusterResources: {}
    - clusterResources: {}

End result should be

apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: sample
spec:
  collectors:
    - clusterResources: {}

We should default to collecting more not less

Example B:

apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: sample
spec:
  collectors:
    - clusterResources: {}
    - clusterResources:
        namespaces:
        - myapp-namespace

End result should be

apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: sample
spec:
  collectors:
    - clusterResources: {}

Given two unique collectors, use them both

Example C:

apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: sample
spec:
  collectors:
    - clusterResources:
        namespaces:
        - some-namespace
    - clusterResources:
        namespaces:
        - myapp-namespace

End result should be

apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: sample
spec:
  collectors:
    - clusterResources:
        namespaces:
        - some-namespace
    - clusterResources:
        namespaces:
        - myapp-namespace

Definition of done

  • [ ] Deduplication for the cluster resources collector is added in its Merge function
  • [ ] Unit tests

diamonwiggins avatar Oct 03 '22 15:10 diamonwiggins

@xavpaice happy to get your thoughts here :)

diamonwiggins avatar Oct 05 '22 16:10 diamonwiggins

I really like that you included examples of what inputs should look like when merged. Those would make really good test cases to confirm this is implemented correctly.

This seems like a good first run at deduplication since this collector probably stands to take the most time to run and preventing duplicate collections could save significant time.

chris-sanders avatar Oct 11 '22 16:10 chris-sanders