sensu-go
DISCUSS: core/v2.Pipeline debugging & error handling
Feature Suggestion
The new core/v2.Pipeline resource introduced in Sensu Go version 6.5 was effectively an improved implementation of handler sets. Now that we have a first-class resource for configuring pipelines (as opposed to further overloading the handler resource), we have a new opportunity to address a long-standing papercut in Sensu: limited visibility into ~Handler~ Pipeline workflow executions.
Potential Pipeline workflow states that can be difficult to troubleshoot at times:
- Handler execution was filtered (filter executed with exit status 0). Recent improvements in the web app have helped address this, but would some optional debug output be helpful here?
- Filter error (filter executed with non-zero exit status), resulting in failed Pipeline workflow execution
- Mutator error (mutator executed with non-zero exit status), resulting in failed Pipeline workflow execution
- Handler error (handler executed with non-zero exit status), resulting in failed Pipeline workflow execution
- Handler executed successfully (handler execution was not filtered, and either no mutator was applied or a mutator was applied and executed successfully). This is the desired outcome and not actually an error. However, depending on the handler type, successful handler execution can be difficult to verify (e.g. a handler sends malformed or incorrectly tagged data to a data platform, which makes the data difficult to query). Would some debug output be helpful here?
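To make the list above concrete, here is a minimal Go sketch of how a workflow run could be mapped to one of these states. Every name in it (`WorkflowState`, `StepResult`, `WorkflowResult`, `Classify`) is hypothetical and invented for discussion; none of this exists in sensu-go today, and the real pipeline adapter may track results quite differently.

```go
package main

import "fmt"

// WorkflowState is a hypothetical label for the outcome of one workflow run.
type WorkflowState string

const (
	StateFiltered     WorkflowState = "filtered"
	StateFilterError  WorkflowState = "filter_error"
	StateMutatorError WorkflowState = "mutator_error"
	StateHandlerError WorkflowState = "handler_error"
	StateHandled      WorkflowState = "handled"
	StateUnknown      WorkflowState = "unknown"
)

// StepResult is a hypothetical record of a single filter, mutator, or handler step.
type StepResult struct {
	Ran        bool // false if the workflow has no such step or never reached it
	ExitStatus int
}

// WorkflowResult is a hypothetical summary of one workflow execution.
type WorkflowResult struct {
	Filter   StepResult
	Filtered bool // true if the filter step decided to drop the event
	Mutator  StepResult
	Handler  StepResult
}

// Classify maps a result to one of the states from the list above,
// checking the steps in the order a workflow runs them.
func Classify(r WorkflowResult) WorkflowState {
	switch {
	case r.Filter.Ran && r.Filter.ExitStatus != 0:
		return StateFilterError
	case r.Filter.Ran && r.Filtered:
		return StateFiltered
	case r.Mutator.Ran && r.Mutator.ExitStatus != 0:
		return StateMutatorError
	case r.Handler.Ran && r.Handler.ExitStatus != 0:
		return StateHandlerError
	case r.Handler.Ran:
		return StateHandled
	default:
		return StateUnknown
	}
}

func main() {
	// The filter passed the event, then the mutator failed,
	// so the handler never ran.
	r := WorkflowResult{
		Filter:  StepResult{Ran: true, ExitStatus: 0},
		Mutator: StepResult{Ran: true, ExitStatus: 1},
	}
	fmt.Println(Classify(r)) // prints "mutator_error"
}
```

A single explicit state per run would give the web app or any debug output one field to display, rather than requiring users to infer the outcome from logs.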
NOTE: ALL of this information is available in sensu-backend logs, but many Sensu users operate Sensu as a service, meaning many users don't have access to Sensu logs. Logs are also not an ideal UX for user feedback. We need a first-class solution to present these results in the product.
Possible Implementation
- Output Sensu events (in a reserved namespace?) for certain handler execution states
- Use a new `core/v2.Pipelines.Spec.output` configuration scope for enabling "pipeline events"? Enabling handler execution results as Sensu events could be a useful building block for debugging purposes, or even for fall-back alerting (get notified if a critical handler execution fails).
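As a starting point for discussion, the two ideas above could look roughly like the following Go sketch. This is an assumption-heavy illustration: `OutputConfig`, `PipelineEvent`, `BuildPipelineEvent`, the `sensu-pipelines` namespace, and the state strings are all made up here. `PipelineEvent` is a simplified stand-in and not the real core/v2.Event type. The only borrowed convention is the check status scale, where 0 means OK and 2 means critical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OutputConfig is a hypothetical shape for the proposed Spec.output scope.
type OutputConfig struct {
	Enabled      bool   // emit pipeline events at all
	Namespace    string // reserved namespace that receives the events
	OnlyFailures bool   // skip events for filtered or successful runs
}

// PipelineEvent is a hypothetical, simplified payload (not core/v2.Event).
type PipelineEvent struct {
	Namespace string `json:"namespace"`
	Pipeline  string `json:"pipeline"`
	Workflow  string `json:"workflow"`
	State     string `json:"state"`
	Status    int    `json:"status"` // 0 = OK, 2 = critical
	Output    string `json:"output"`
}

// BuildPipelineEvent returns the event to publish for one workflow run,
// or false if the configuration says no event should be emitted.
func BuildPipelineEvent(cfg OutputConfig, pipeline, workflow, state, detail string) (PipelineEvent, bool) {
	failed := state != "handled" && state != "filtered"
	if !cfg.Enabled || (cfg.OnlyFailures && !failed) {
		return PipelineEvent{}, false
	}
	status := 0
	if failed {
		status = 2
	}
	return PipelineEvent{
		Namespace: cfg.Namespace,
		Pipeline:  pipeline,
		Workflow:  workflow,
		State:     state,
		Status:    status,
		Output:    detail,
	}, true
}

func main() {
	cfg := OutputConfig{Enabled: true, Namespace: "sensu-pipelines", OnlyFailures: true}
	ev, ok := BuildPipelineEvent(cfg, "incident-alerts", "pagerduty",
		"handler_error", "handler exited with status 1")
	if !ok {
		return
	}
	b, err := json.MarshalIndent(ev, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Because the result is an ordinary event with a critical status, a second pipeline could handle it, which is what would make fall-back alerting possible. An `OnlyFailures`-style switch is one way to keep event volume down when only failed executions matter.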
🚧 this is a work in progress // more coming soon 🚧
TODO:
- [ ] There's plenty of prior art here — some of which are GitHub issues — let's link those here
- [ ] Draft a proposal
Related: https://github.com/sensu/sensu-enterprise-go/issues/1918
@asachs01 called out another pipeline workflow state in sensu/sensu-enterprise-go#2421 — asset download failures.
Since filters, mutators, and handlers all support assets, there are at least three distinct asset download failure states to consider.