HotShot
[MACRO] - Clean up logging
What is this task and why do we need to work on it?
Our logs are currently cluttered, which makes them hard to read. We need to bring the amount of logs emitted down to a more reasonable level.
What work will need to be done to complete this task?
Check our log statements and demote/remove/consolidate them where appropriate.
- [x] #2296
Are there any other details to include?
No response
What are the acceptance criteria to close this issue?
Logs aren't cluttered.
Branch work will be merged to (if not the default branch)
No response
[Pasting from Zulip.]
Salman:
- error! should really only be for things we actually don't expect to happen (but don't want to panic for)
- a lot of our error! logs should probably be info! at most
Jeb:
- debug for things that happen regularly and routinely, like every view, every decide, etc
- info for things that only happen in certain cases, aren't a problem themselves but may be triggered by and help debug a problem. For example, in a code path that is only used in some sort of recovery mode (like in the query service when we go to fetch missing data) you can use info much more liberally
- warn for things that the system should be able to handle but still indicate a degraded condition that we may want a human to look into, e.g. network messages being lost more frequently than usual
- error for things that might leave the system in a permanently degraded state without manual intervention, e.g. "block not available at decide" (before we implemented fetching recovery)
Keyao:
- It's more important to distinguish warn/error vs. info/debug.
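The guidelines above can be sketched as a small mapping from event kinds to levels. This is only an illustration using the standard library: the `Severity` and `Event` types and the `level_for` and `needs_attention` functions are hypothetical names, not part of HotShot or of any logging crate, and the event kinds are taken from the examples in the bullets.

```rust
/// Log levels ordered from least to most severe (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Debug,
    Info,
    Warn,
    Error,
}

/// Hypothetical event kinds, named after the examples in the guidelines.
#[derive(Debug, Clone, Copy)]
enum Event {
    ViewChange,               // happens every view
    Decide,                   // happens every decide
    FetchMissingData,         // only runs in a recovery path
    ElevatedMessageLoss,      // degraded but recoverable
    BlockUnavailableAtDecide, // may need manual intervention
}

/// Picks a level for each event kind following the guidelines.
fn level_for(event: Event) -> Severity {
    match event {
        Event::ViewChange | Event::Decide => Severity::Debug,
        Event::FetchMissingData => Severity::Info,
        Event::ElevatedMessageLoss => Severity::Warn,
        Event::BlockUnavailableAtDecide => Severity::Error,
    }
}

/// The warn/error vs. info/debug split: does a human need to look at this?
fn needs_attention(level: Severity) -> bool {
    level >= Severity::Warn
}

fn main() {
    let events = [
        Event::ViewChange,
        Event::Decide,
        Event::FetchMissingData,
        Event::ElevatedMessageLoss,
        Event::BlockUnavailableAtDecide,
    ];
    for event in events {
        let level = level_for(event);
        println!(
            "{:?} -> {:?} (needs attention: {})",
            event,
            level,
            needs_attention(level)
        );
    }
}
```

In real code the chosen level would simply decide which logging macro (`debug!`, `info!`, `warn!`, or `error!`) is called at that site.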
Note: debug isn't supported in Datadog, so we should choose info for non-concerning things that we want to display in Datadog. (Relevant Zulip discussion.)
Note: Now info actually isn't supported in Datadog by default either. But the log level is something we could easily change in the Dockerfile.
Salman once mentioned:
If we fix the log levels in a specific crate, we could enable info just for that crate/module, e.g. RUST_LOG=warn,hotshot_orchestrator::config=info.
So don't worry about what Datadog supports. As long as we fix the log levels, the Datadog filter can be changed easily.
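To make the effect of a filter like `warn,hotshot_orchestrator::config=info` concrete, here is a simplified model of how such a directive string could be resolved. It is a sketch under stated assumptions, not the actual filter implementation of any logging crate: a bare level sets the global default, `target=level` directives match by prefix with the longest prefix winning, and an event with no applicable directive is dropped. `Verbosity`, `parse_level`, and `enabled` are hypothetical names, and `hotshot::consensus` is a made-up target used only for illustration.

```rust
/// Levels ordered from least to most verbose (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Verbosity {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Parses a level name, ignoring case and surrounding whitespace.
fn parse_level(s: &str) -> Option<Verbosity> {
    match s.trim().to_ascii_lowercase().as_str() {
        "error" => Some(Verbosity::Error),
        "warn" => Some(Verbosity::Warn),
        "info" => Some(Verbosity::Info),
        "debug" => Some(Verbosity::Debug),
        "trace" => Some(Verbosity::Trace),
        _ => None,
    }
}

/// Returns true if an event at `level` from `target` passes `filter`.
/// Simplified model: unparsable directives are ignored.
fn enabled(filter: &str, target: &str, level: Verbosity) -> bool {
    let mut default_max: Option<Verbosity> = None;
    // Best matching target directive so far: (prefix length, max level).
    let mut best: Option<(usize, Verbosity)> = None;

    for directive in filter.split(',').map(str::trim).filter(|d| !d.is_empty()) {
        match directive.split_once('=') {
            // A bare level such as "warn" sets the global default.
            None => {
                if let Some(l) = parse_level(directive) {
                    default_max = Some(l);
                }
            }
            // "prefix=level" applies to targets starting with `prefix`.
            Some((prefix, lvl)) => {
                if let Some(l) = parse_level(lvl) {
                    let longer = best.map_or(true, |(len, _)| prefix.len() > len);
                    if target.starts_with(prefix) && longer {
                        best = Some((prefix.len(), l));
                    }
                }
            }
        }
    }

    // The most specific target directive overrides the global default.
    let max = best.map(|(_, l)| l).or(default_max);
    max.map_or(false, |m| level <= m)
}

fn main() {
    let filter = "warn,hotshot_orchestrator::config=info";
    // info is enabled only for the module named in the directive.
    println!("{}", enabled(filter, "hotshot_orchestrator::config", Verbosity::Info)); // true
    // Everything else falls back to the global warn level.
    println!("{}", enabled(filter, "hotshot::consensus", Verbosity::Info)); // false
    println!("{}", enabled(filter, "hotshot::consensus", Verbosity::Warn)); // true
}
```

The point of the sketch is that once each crate uses levels consistently, visibility becomes a deployment-time setting rather than a code change.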