troubleshoot
troubleshoot copied to clipboard
Prevent preflight fail
@marccampbell Hi Marc, after digging further in the code trying to fix issues #239 and #227, I found out that a flag called "collect-without-permissions", bound to viper was, I think, wrongly set to false.
It looks like this issue (not failing with RBAC permissions) was addressed before, and this flag was aimed to solve the problem. Both support-bundle and preflight run this after RBAC checks:
//opts.IgnorePermissionErrors = v.GetBool("collect-without-permissions")
if foundForbidden && !opts.IgnorePermissionErrors {
collectResult.IsRBACAllowed = false
return collectResult, errors.New("insufficient permissions to run all collectors")
}
As the flag was set to false, the program stops if there is any permission issue. Downstream the programm seems fine, every collector failure is communicated to the user, and doesn't interrupts the next stages.
Closes #239 and Closes #227.
I don't know if this is right. The default of false
shows that there is an error.
The issue this is attempting to address is unrelated to this flag. If I have a preflight spec like:
apiVersion: troubleshoot.sh/v1beta2
kind: Preflight
metadata:
name: preflight-tutorial
spec:
analyzers:
- clusterVersion:
outcomes:
- fail:
when: "< 1.16.0"
message: The application requires at least Kubernetes 1.16.0, and recommends 1.18.0.
uri: https://kubernetes.io
- warn:
when: "< 1.18.0"
message: Your cluster meets the minimum version of Kubernetes, but we recommend you update to 1.18.0 or later.
uri: https://kubernetes.io
- pass:
message: Your cluster meets the recommended and required versions of Kubernetes.
But my RBAC permission doesn't have permission to list... Statefulsets, then I receive an error. But the collection of statefulsets is not needed for the analyzer specified, so we should not fail in this case.
Do you receive this error?
$kubectl preflight example_preflight.yml
Error: insufficient permissions to run all collectors
@manavellamnimble correct. that's the error
Yes, I think that's the reason. If you look the line in the link, before entering the collection phase, preflight (and also support bundle) checks if there was any permission error at all, and if the flag ignorePermissionErrors is set to false (indicating that no permission error should be allowed), if foundForbidden && !opts.IgnorePermissionErrors {
. Given that the flag is set to false, execution will always stop if there is any permission error (even if it doesn't concern your analyzers).
In line 71 you can see that RBAC errors are only reported to the progress channel, but execution is not stopped. Cluster resources would run even if there were permission errors.
https://github.com/replicatedhq/troubleshoot/blob/4149e4d6eddaa311e8940b534d4503e62a57f7ed/pkg/preflight/collect.go#L64
I was thinking, the flag's initial value depends on the desired default behavior, which I assumed would be to ignore rbac errors. If that were not the case, it should be set to true either by the user or using some logic.
@marccampbell Morning Marc, please tell me if you still get that error if you run:
kubectl preflight your_preflight_file.yml --collect-without-permissions=true
.
You should get a warning * cannot collect cluster-resources: action "list" is not allowed on resource "xxxxx" at the cluster scope
, but still get the results you expect.
Maybe I am missing something, but it seems that preflight is already programmed to ignore the failure of any analyzer/collector, inform the error and run the other analyzers/collectors in the spec. Or maybe I am misunderstanding the issue.
@manavellamnimble i can work around this issue, but the request here isn't to ignore warnings, but instead to only fail on warnings that come from collectors that are used be declared analyzers.
if i don't have an analyzer using a collector in a prefight, i'd rather not have to pass a flag to ignore warnings or change defaults.
@marccampbell Okey I think I got it now, I will figure out a way to solve this and make a new commit
@marccampbell Hi Marc, here is what I think is the cleaner approach to solve the problem, preventing as much as posible spreading modifications all over the code.
I wrote the conditional statements in isCollectorRequired
to make them more readable, but it may as well be a single long-conditional statement. Basically I am using the existing flag foundForbidden
to tell preflight if the RBAC error will affect the current run or not.
After running subject review, the code checks if the error happens in the cluster-resources collector (the one that generates conflict), and if that is so, if there is an analyzer that requires those permissions to make the analysis.
This modification does not affects support-bundle behavior.
Permissions to list Deployments and StatefulSets are not checked. Should I add them to the access review specs?
I almost forgot, I am not checking the verb because cluster-resources uses only "List", but if it would be possible to check it if it were necessary in future enhancements.
Hi @divolgin @marccampbell! Could you check this PR? There is not any hurry, but I didn't want it to get too old.
I don't see how this implements what #239 requests. A collector can fail for any reason. We shouldn't show any errors if there are no analyzers defined for that collector.
@divolgin currently preflight will fail on RBAC errors from any of the collectors added by default, wether they are required or not by the analyzers. For example, if your analyzers don't include CRD's but you don't have permissions to list CRDs, preflight fails. With this modification, as there is no CRD analyzer, preflight will not fail.