
Improve Legibility and parsability of Logging

Open DaveWK opened this issue 2 years ago • 2 comments

Problem

By default, the logs for solana-validator are a fire hose of metrics with a few critical bits of logging amongst the chatter. This makes it very hard to separate signal from noise when diagnosing issues. While there are ways to reduce this by setting environment variables for the Rust logger, none of it is documented, and no best practices have been established.

Proposed Solution

There are a few approaches:

  1. Document the logging environment variable and provide examples of best practices.
  2. Separate the "Metrics" chatter into a separate log from other messages.
  3. Allow WARN/ERROR level messages to go to a separate file.
  4. Document how to switch the logging format to JSON so it's easier to run parsers such as fluentbit, vector, or logstash over the files.
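
To make options 3 and 4 concrete, here is a minimal, std-only sketch of a post-processing filter that turns plain-text log lines into JSON and copies WARN/ERROR records to a second stream. It assumes an env_logger-style line layout of `[<timestamp> <LEVEL> <target>] <message>`; that layout is an assumption and is not confirmed anywhere in this issue. The names `Record`, `parse_line`, `json_escape`, `to_json`, and `is_critical` are hypothetical helpers, not part of any Solana API.

```rust
use std::io::{self, BufRead, Write};

/// One parsed log line. Layout assumed: "[<timestamp> <LEVEL> <target>] <message>".
struct Record<'a> {
    timestamp: &'a str,
    level: &'a str,
    target: &'a str,
    message: &'a str,
}

/// Split a line into its header fields and message; returns None if the
/// line does not follow the assumed layout.
fn parse_line(line: &str) -> Option<Record<'_>> {
    let rest = line.strip_prefix('[')?;
    let (header, message) = rest.split_once("] ")?;
    // The level may be padded with extra spaces, so split on any whitespace.
    let mut fields = header.split_whitespace();
    let timestamp = fields.next()?;
    let level = fields.next()?;
    let target = fields.next()?;
    Some(Record { timestamp, level, target, message })
}

/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render a record as a single-line JSON object.
fn to_json(r: &Record) -> String {
    format!(
        "{{\"timestamp\":\"{}\",\"level\":\"{}\",\"target\":\"{}\",\"message\":\"{}\"}}",
        json_escape(r.timestamp),
        json_escape(r.level),
        json_escape(r.target),
        json_escape(r.message)
    )
}

/// Option 3: decide whether a record belongs in the "critical only" stream.
fn is_critical(r: &Record) -> bool {
    matches!(r.level, "WARN" | "ERROR")
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let stdout = io::stdout();
    let stderr = io::stderr();
    let mut out = stdout.lock();
    let mut err = stderr.lock();
    for line in stdin.lock().lines() {
        let line = line?;
        match parse_line(&line) {
            Some(record) => {
                let json = to_json(&record);
                // Every record goes to stdout, keeping chronological order.
                writeln!(out, "{}", json)?;
                // WARN/ERROR records are also copied to stderr.
                if is_critical(&record) {
                    writeln!(err, "{}", json)?;
                }
            }
            // Lines that do not match the assumed layout are kept, wrapped as raw text.
            None => writeln!(out, "{{\"raw\":\"{}\"}}", json_escape(&line))?,
        }
    }
    Ok(())
}
```

Redirecting stdout and stderr to different files would give one complete JSON log and one file with only critical records. Continuation lines of multi-line messages do not match the assumed layout and end up as `raw` entries, which is one reason native JSON output from the logger would be more robust than post-processing.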

DaveWK, Sep 03 '22 01:09

Ideally I would suggest option 4, because it allows for much more flexibility in parsing and shipping. This allows end users to use whatever monitoring and log aggregation technologies they are most familiar with rather than dictating a particular technology. It may already be possible to get JSON-formatted logs, but I do not see any place this is documented.

DaveWK, Sep 03 '22 01:09

This got discussed again in Discord somewhat recently, so a few notes:

  1. We just go off of the built-in RUST_LOG. This is pretty standard, but I guess we could link to some Rust documentation for those unfamiliar with Rust, as well as provide a warning about verbosity with lower levels.
  2. Running default config, metrics are logged to the local file AND submitted to the metrics db (assuming proper config of other metrics variables).
    • This is something that has been discussed / thought about a little. It is currently possible to disable local logging while continuing to submit metrics by setting RUST_LOG=solana=info,solana_metrics=warn. See https://github.com/solana-labs/solana/issues/15215 for more details.
  3. This is an interesting thought, although it is convenient to have all the logs in chronological order when trying to piece things together. I think I see your point, though, in wanting a file with only critical issues.
  4. No customization on format; the logging is just "plain-text" as you see it. Something like JSON would definitely be nice for ease of parsing, but getting some consistency across all the messages is probably the first step. Once things are consistent, a more sweeping change would be possible.
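
For readers unfamiliar with RUST_LOG, the following std-only sketch models why `RUST_LOG=solana=info,solana_metrics=warn` quiets local metrics lines while leaving other `solana*` targets at info. It assumes the validator's logger follows env_logger-style matching, where each directive is a target prefix and the longest matching prefix decides the level. This is a simplified model, not the logger's actual code; `Level`, `parse_directives`, `max_level_for`, and `enabled` are hypothetical names, and the target strings are only illustrative.

```rust
// Simplified model of RUST_LOG directive matching; not the real logger code.

/// Levels ordered from least to most verbose.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Off,
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

fn parse_level(s: &str) -> Option<Level> {
    match s.to_ascii_lowercase().as_str() {
        "off" => Some(Level::Off),
        "error" => Some(Level::Error),
        "warn" => Some(Level::Warn),
        "info" => Some(Level::Info),
        "debug" => Some(Level::Debug),
        "trace" => Some(Level::Trace),
        _ => None,
    }
}

/// Parse a spec such as "solana=info,solana_metrics=warn" into
/// (target prefix, most verbose level allowed) pairs.
fn parse_directives(spec: &str) -> Vec<(String, Level)> {
    let mut directives: Vec<(String, Level)> = spec
        .split(',')
        .map(str::trim)
        .filter(|part| !part.is_empty())
        .filter_map(|part| match part.split_once('=') {
            // "target=level"; entries with an unknown level are skipped.
            Some((target, level)) => parse_level(level).map(|l| (target.to_string(), l)),
            None => Some(match parse_level(part) {
                // A bare level applies to every target (empty prefix).
                Some(l) => (String::new(), l),
                // A bare target enables everything for that target.
                None => (part.to_string(), Level::Trace),
            }),
        })
        .collect();
    if directives.is_empty() {
        // With no usable directives, fall back to errors only.
        directives.push((String::new(), Level::Error));
    }
    directives
}

/// The longest prefix that matches the target wins; no match means Off.
fn max_level_for(directives: &[(String, Level)], target: &str) -> Level {
    directives
        .iter()
        .filter(|(prefix, _)| target.starts_with(prefix.as_str()))
        .max_by_key(|(prefix, _)| prefix.len())
        .map(|(_, level)| *level)
        .unwrap_or(Level::Off)
}

fn enabled(directives: &[(String, Level)], target: &str, level: Level) -> bool {
    level != Level::Off && level <= max_level_for(directives, target)
}

fn main() {
    let directives = parse_directives("solana=info,solana_metrics=warn");
    for (target, level) in [
        ("solana_metrics::metrics", Level::Info),
        ("solana_metrics::metrics", Level::Warn),
        ("solana_core::replay_stage", Level::Info),
    ] {
        println!(
            "{} {:?}: enabled = {}",
            target,
            level,
            enabled(&directives, target, level)
        );
    }
}
```

Because matching is a plain string prefix, `solana=info` also covers targets such as `solana_core`, while the longer `solana_metrics` prefix overrides it for the metrics target only. Lowering a level (for example to `debug` or `trace`) increases log volume considerably, which is the verbosity warning mentioned in item 1.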

steviez, Jan 23 '23 23:01