loggerv Add ability to filter out spammy modules

Add ability to filter out spammy modules

Open clux opened this issue 6 years ago • 7 comments

If we include any simple http requests nowadays it's no longer just the slightly spammy hyper client, but also tons of crap from tokio and mio, which I'm not usually interested in.

env-logger supported this with RUST_LOG=hyper=warn,tokio=warn. Would be nice to have a similar thing.

Jan 29 '18 10:01 clux

Current workaround:

clitool -vv spammy-subcmd | grep -E "^clitool"
# or
clitool -vv spammy-subcmd | grep -vE "^tokio|^hyper"

Jan 29 '18 14:01 clux

I am starting to work with the tokio, futures, mio, etc. stack as well, and there are a lot of spammy statements that would be nice to filter out. I am definitely in favor of this feature addition.

Do you have an idea for the API? I have not looked at the env_logger crate, but I will propose the following, using the run_time_config.rs example as a template:

extern crate ansi_term;
#[macro_use] extern crate log;
extern crate loggerv;
extern crate clap;

use clap::{Arg, App};

fn main() {
    // ... Argument parsing 

    // It would be up to the user of this crate to decide the CLI arguments
    // that would map to a module name and level. Would the key-value pairs use
    // an equal sign? Would the option be named `log-filter` or just `filter`?
    // Answering these questions would be left to the application developer.

    // ... Parse log filter arguments into module name and level

    // Below, a static example is used to show that specific dependencies within an application can always be filtered.

    loggerv::Logger::new()
        .verbosity(args.occurrences_of("v"))
        .filter("hyper", WARN) // method signature = fn filter(&mut self, module_name: &str, level: Level) -> &mut Self
        .filter("tokio", WARN)
        .filters(&vec![("hyper", WARN), ("tokio", WARN)]) // Another method with a vector/array of module name, level tuples
        .level(args.is_present("level"))
        .line_numbers(args.is_present("debug"))
        .module_path(!args.is_present("no-module-path"))
        .colors(!args.is_present("no-color"))
        .init()
        .unwrap();

    error!("This is always printed to stderr");
    warn!("This too is always printed to stderr");
    info!("This is optionally printed to stdout based on the verbosity");
    debug!("This is optionally printed to stdout based on the verbosity");
    trace!("This is optionally printed to stdout based on the verbosity");
}

Thoughts?

Feb 02 '18 18:02 volks73

I think that's a nice API. Unfortunately, it's probably too static. If we actually needed the hyper logs to see what was going on because some reqwest wrapping API library wasn't passing down the http error properly (literally today), then we'd need to recompile the app to get those logs which isn't great.

I think maybe we should allow RUST_LOG=module1=level1,module2=level2 style overrides to also be parsed and take precedence over filter("module1", "defaultlevel1").

This would also provide some sane expectations to people used to env_logger, which is pretty much the most downloaded simple logger out there. Then in a sense, we'd be like a superset of that crate, but with possibly better tuned defaults :-).

I.e. with the combined suggestions:

# verbose output from our own tool but not tokio/hyper:
clitool -vv spammy-subcmd

# verbose from everything except tokio:
RUST_LOG=hyper=trace clitool -vv spammy-cubcmd

Feb 02 '18 21:02 clux

Hmm, I agree with supporting the RUST_LOG environment variable for consistency with the env_logger crate, but perhaps that should be a separate issue/feature request? I see the RUST_LOG support using the same filter implementation behind the scenes, so once the filter and filters methods have been added, then the RUST_LOG support becomes kind of trivial. I wonder about of order of operations. Does the RUST_LOG value always override any CLI values? In other words, what happens in the following situation:

$ RUST_LOG=hyper=warn clitool -vvv hyper=trace

$ RUST_LOG=hyper=trace clitool -vvv hyper=warn

My experience indicates the value closest to the command to have precedence, i.e. CLI values would override shell variables, shell variables override user environment variables, and user environment variables override system environment variables, etc. An additional override method would probably be needed if the intention is to have the RUST_LOG value override CLI values. I can see deployments having different values for the environment variable but then a user needing to brief change the filters for debugging within the environment using the CLI.

Here are some more flushed out examples that would allow things to be more dynamic and even provide a basic implementation of the RUST_LOG environment variable using only the filter and filters methods for the API. Note, the clap crate and iterators are doing most of the "heavy lifting", so all we really have to implement are the two methods, and please forgive the lack of sufficient error handling and user input validation/sanitation.

extern crate ansi_term;
#[macro_use] extern crate log;
extern crate loggerv;
extern crate clap;

use clap::{Arg, App};

fn main() {
    let args = App::new("clitool")
       .arg(Arg::with_name("v")
            .short("v")
            .multiple(true)
            .help("Sets the level of verbosity"))
       .arg(Arg::with_name("debug")
            .short("d")
            .long("debug")
            .help("Adds the line numbers to log statements"))
       .arg(Arg::with_name("no-module-path")
            .long("no-module-path")
            .help("Disables the module path in the log statements"))
       .arg(Arg::with_name("no-color")
            .long("no-color")
            .help("Disables colorized output"))
       .arg(Arg::with_name("log-filter")
            .long("log-filter")
            .short("f")
            .help("Adds a filter to the log statements by module name.")
            .takes_value(true)
            .multiple(true)) // <-- Automatically enables comma delimited multiple values
       .arg(Arg::with_name("level")
            .short("l")
            .long("level")
            .help("Adds the log level to the log statements. This will also surround the module path in square brackets."))
       .get_matches();

    // Log statements are filtered dynamically based on the key-value pairs passed to the `log-filter` argument.
    let filters: Vec<(&str, &str)> = if let Some(log_filters) = args.values_of("log-filter") {
        log_filters.fold(Vec::new(), |mut filters, arg_value| {
            filters.append(&mut arg_value.split(",").collect());
            filters
        }).iter().map(|kv| {
            let mut pair_iter = kv.split("=");
            let module_name = pair_iter.next().unwrap();
            let level = pair_iter.next().unwrap(); 
            (module_name, level)
        }).collect()
    } else {
        Vec::new()
    };

    loggerv::Logger::new()
        .verbosity(args.occurrences_of("v"))
        .filters(&filters)
        .level(args.is_present("level"))
        .line_numbers(args.is_present("debug"))
        .module_path(!args.is_present("no-module-path"))
        .colors(!args.is_present("no-color"))
        .init()
        .unwrap();

    error!("This is always printed to stderr");
    warn!("This too is always printed to stderr");
    info!("This is optionally printed to stdout based on the verbosity");
    debug!("This is optionally printed to stdout based on the verbosity");
    trace!("This is optionally printed to stdout based on the verbosity");
}

This would enable the following example commands:

clitool -vvv --log-filter=hyper=warn,tokio=warn
clitool -vvv --log-filter=hyper=warn --log-filter=tokio=warn
clitool -vvv -f hyper=warn,tokio=warn
clitool -vvv -f hyper=warn -f tokio=warn

Here would be an example of using the RUST_LOG environment variable to override the command line arguments without directly incorporating support, just need to implement the filter and filters methods.

extern crate ansi_term;
#[macro_use] extern crate log;
extern crate loggerv;
extern crate clap;

use clap::{Arg, App};
use std::env;

fn main() {
    let args = App::new("clitool")
       .arg(Arg::with_name("v")
            .short("v")
            .multiple(true)
            .help("Sets the level of verbosity"))
       .arg(Arg::with_name("debug")
            .short("d")
            .long("debug")
            .help("Adds the line numbers to log statements"))
       .arg(Arg::with_name("no-module-path")
            .long("no-module-path")
            .help("Disables the module path in the log statements"))
       .arg(Arg::with_name("no-color")
            .long("no-color")
            .help("Disables colorized output"))
       .arg(Arg::with_name("log-filter")
            .long("log-filter")
            .short("f")
            .help("Adds a filter to the log statements by module name.")
            .takes_value(true)
            .multiple(true)) // <-- Automatically enables comma delimited multiple values
       .arg(Arg::with_name("level")
            .short("l")
            .long("level")
            .help("Adds the log level to the log statements. This will also surround the module path in square brackets."))
       .get_matches();

    // Obtain the CLI filters like in the previous example
    let cli_filters: Vec<(&str, &str)> = if let Some(log_filters) = args.values_of("log-filter") {
        log_filters.fold(Vec::new(), |mut filters, arg_value| {
            filters.append(&mut arg_value.split(",").collect());
            filters
        }).iter().map(|kv| {
            let mut pair_iter = kv.split("=");
            let module_name = pair_iter.next().unwrap();
            let level = pair_iter.next().unwrap(); 
            (module_name, level)
        }).collect()
    } else {
        Vec::new()
    };

    // Then, parse the RUST_LOG environment variable using a similar process to above for the CLI arguments
    let env_filters: Vec<(&str, &str)> = if let Ok(env_value) = env::var("RUST_LOG") {
            env_value.split(",").collect().iter().map(|kv| {
                let mut pair_iter = kv.split("=");
                let module_name = pair_iter.next().unwrap();
                let level = pair_iter.next().unwrap(); 
                (module_name, level)
            }).collect()
        }
    } else {
        Vec::new()
    };

    loggerv::Logger::new()
        .verbosity(args.occurrences_of("v"))
        .filters(&cli_filters)
        .filters(&env_filters) // By setting the filters with the env values second, the RUST_LOG environment variable-based filters will "override" the CLI filters. To have the CLI filters override the RUST_LOG filters, switch the order.
        .level(args.is_present("level"))
        .line_numbers(args.is_present("debug"))
        .module_path(!args.is_present("no-module-path"))
        .colors(!args.is_present("no-color"))
        .init()
        .unwrap();

    error!("This is always printed to stderr");
    warn!("This too is always printed to stderr");
    info!("This is optionally printed to stdout based on the verbosity");
    debug!("This is optionally printed to stdout based on the verbosity");
    trace!("This is optionally printed to stdout based on the verbosity");
}

The blocks of code that deal with environment variable and CLI parsing could be moved into this crate as functions for convenience.

Feb 02 '18 23:02 volks73

That's all very sensible. I hadn't considered that CLI's might expose the filter like that, in which case the precedence of the two does becomes a problem. I do agree with your reasoning that arguments passed to the app would override any environment variables. But loggerv should probably keep the handling of RUST_LOG internal (reading the environment variable at init time only to preserve the precedence of explicit arguments) - given that it's a log convention.

I would personally not bother with any of this style of parsing in my apps and just default to tweaking RUST_LOG at debug time, however, I might tweak the defaults for hyper and tokio in my apps. It feels a bit verbose to implement the full set of options myself, but with a filter method there's nothing stopping anyone from doing it if they'd wish to, and if RUST_LOG was read internally, the behaviour would be the same ~~due~~.

Does this seem like a sane compromise?

Feb 03 '18 22:02 clux

Does this seem like a sane compromise?

Absolutely. I am all in favor adding internal support for the RUST_LOG environment variable, which would make loggerv a superset of env_logger. Should we have support for RUST_LOG be a separate issue and keep this one for tracking/following implementation of the filter and filters methods?

Feb 04 '18 00:02 volks73

Nice. I'm happy with having it in here, but feel free to raise another issue if you want to work on one of the ones. Though looks like you've already done the evar parsing in the above code anyway :-)

Feb 04 '18 00:02 clux

loggerv loggerv copied to clipboard

Add ability to filter out spammy modules

loggerv
loggerv copied to clipboard