troubleshoot
troubleshoot copied to clipboard
Deduplication for Cluster Resources Collector
Problem to solve
With the recently added feature to concatenate multiple specs together at runtime, we need to ensure that we are not unnecessarily running the Cluster Resources collectors more than it should be. We need to implement deduplication for this collector so that if it's specified more than once with the same options, that it only runs once.
User Impact
- Troubleshoot may take longer to run due to a redundant operation
Potential solution
Deduplicate any instances of the Cluster Resources collector in the final concatenated spec before running.
Example A:
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
name: sample
spec:
collectors:
- clusterResources: {}
- clusterResources: {}
End result should be
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
name: sample
spec:
collectors:
- clusterResources: {}
We should default to collecting more not less
Example B:
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
name: sample
spec:
collectors:
- clusterResources: {}
- clusterResources:
namespaces:
- myapp-namespace
End result should be
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
name: sample
spec:
collectors:
- clusterResources: {}
Given two unique collectors, use them both
Example C:
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
name: sample
spec:
collectors:
- clusterResources:
namespaces:
- some-namespace
- clusterResources:
namespaces:
- myapp-namespace
End result should be
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
name: sample
spec:
collectors:
- clusterResources:
namespaces:
- some-namespace
- clusterResources:
namespaces:
- myapp-namespace
Definition of done
- [ ] Deduplication for the cluster resources collector is added in its
Mergefunction - [ ] Unit tests
@xavpaice happy to get your thoughts here :)
I really like that you included examples of what inputs should look like when merged. Those would make really good test cases to confirm this is implemented correctly.
This seems like a good first run at deduplication since this collector probably stands to take the most time to run and preventing duplicate collections could save significant time.