sensu-go
[EPIC] Audit & improve sensu-backend logging
Description
Sensu backend log output can be cumbersome for some users – especially so when embedded etcd is enabled. We need to audit/review/update/move (to different log levels) various log output from the sensu-backend.
Goals
- Make log output more tunable by Sensu operators – ideally without restarting services
- Establish criteria for `ERROR`, `WARN`, `INFO`, and `DEBUG` log levels
  - should `WARN` level log output be actionable?
  - should `ERROR` be reserved for matters concerning sensu platform reliability?
- should
- Establish criteria for log output that should become Sensu events
  - what type of pipeline configuration errors should end-users know about?
  - what logs could be "cluster events" (i.e. backend events published to the `sensu-system` namespace)?
  - what logs could be "namespace events" (i.e. backend events published to user-facing namespaces)?
Proposal
TBD
Spec
Coming soon.
Possible inspiration:
- [ ] #3809
- [ ] #4415
Issues
- [ ] sensu/sensu-enterprise-go#2161
- [ ] #4642
- [ ] #4538
- [ ] #4496
- [ ] #4494
- [ ] sensu/developer-advocacy#35
- [ ] sensu/sensu-enterprise-go#1871
- [ ] sensu/sensu-enterprise-go#1602
- [ ] sensu/sensu-enterprise-go#1173
- [ ] #2894
- [ ] sensu/sensu-enterprise-go#249
- [x] sensu/sensu-enterprise-go#248
- [ ] #4703
- [x] #4395
- [x] sensu/sensu-enterprise-go#2298
- [x] sensu/sensu-enterprise-go#2323
- [ ] #4842
- [x] sensu/sensu-enterprise-go#2270
- [x] sensu/sensu-enterprise-go#2453
- [x] sensu/sensu-enterprise-go#2475
Quick'n dirty thoughts:
Every time I need to debug a pipeline (and its resources), I need to see the raw event input, which forces me to use the debug log level (extremely verbose). When a cluster is under significant load, debug logging is too costly (it negatively impacts backend performance), so I am forced to create a separate pipeline with a `cat` handler to get enough context. Many log events reference an `event_id`. This is fine when the event has been persisted (has a check), but there is no record (without debug logging) of events that only contain metrics.
Hey team! Please add your planning poker estimate with ZenHub @amdprophet @c-kruse @ccressent @echlebek
@fguimond I'm a little confused about what we are meant to be estimating here. Everything in the epic?
@c-kruse - Don't worry about it, I must have selected it by accident.
Need to add https://github.com/sensu/sensu-go/issues/4842 to this as well
Hey, we completed an issue from this epic in 6.8.0! That's progress! 🙌
I will roll this EPIC to 6.9.0 now and aim to complete at least one more issue in that milestone.
@fguimond we need to add https://github.com/sensu/sensu-enterprise-go/issues/2453 to this. This issue was raised in a recent interaction.