cosmos-sdk
[Feature]: Being able to change log level on the fly
Summary
Would be really awesome to have the ability to change log-level on the running node without restarting it.
Problem Definition
Sometimes there's a need to get more verbose logging on a node without actually restarting it. A perfect example is a chain upgrade that has a really big migration (two examples: the Neutron v2.0.0 upgrade and the upcoming Gaia v15 upgrade to v0.47). Imagine you're a node operator whose node is stuck in the migration phase, and you have no way to tell how far along it is, whether the migration is going okay, or what its status is. To investigate further, you as the node owner can update the log level in config.toml and restart the node, but doing that mid-migration would almost certainly leave it with a corrupted state. Therefore, having a way to change the log level on a running node would be really helpful for debugging in such cases.
(Ideally there should be a way to reload the node's config without restarting it, but that can be really tricky to implement; updating the log level should be easier than updating other params.)
Proposed Feature
Have a CLI command, an API call, or some other mechanism that updates the log level on a node without having to restart it.
Thinking out loud here and just brainstorming... given how the logger is set up, this won't be trivial. You would have to call CreateSDKLogger and somehow funnel that change to the rest of the stack. The latter part will be tricky.
@alexanderbez won't zerolog.SetGlobalLevel() work here?
Does that API also affect individual concrete instances of a zerolog.Logger or just the global one?
@alexanderbez as far as I know it's not a call to a method of a specific Logger, but a call to a function within the library itself, and apparently it's designed to set the log level for all Loggers out there: https://github.com/rs/zerolog/blob/bd2896587dac510be79dcea54aaa94f14c4c5822/globals.go#L150
(I haven't tested it tho, but that's how I assume it should work)
@alexanderbez also brainstorming: the log-level change should happen not within the CLI subcommand but within the node itself, so there needs to be some interface between the node and the CLI (or whatever other mechanism is used to change the log level). Doing it via an API call would pose a security threat: imagine a node running smoothly, then somebody makes an API call to switch it from info to trace, making it spam logs and increasing disk reads/writes. So ideally it should be something that's accessible only from the node itself and that can only be done by the node owner, i.e. the person with SSH access to the node server. I'm unsure whether we have such ways of doing it, though. wdyt?
To be honest, I only see two viable options:
A. Watch for node config file changes, i.e. read the file, watch it for changes, and reload it, OR
B. Simply restart the node with the new config change(s)
@alexanderbez B) won't work properly if it's done mid-migration, as it'll end up corrupting the state (I had exactly this happen when restarting my Neutron node mid v2.0 migration)