soda-sql icon indicating copy to clipboard operation
soda-sql copied to clipboard

[Feature] Add a verbose option for `scan`

Open HansBambel opened this issue 3 years ago • 2 comments

Is your feature request related to a problem? Please describe. Having a descriptive output about what is happening is useful, but it would be nice to have an option that reduces the output to the major findings.

Describe the solution you'd like An option such as -verbose for the soda scan command would be nice to have. Running the command with for example verbose -1 would result in only output whether the tests passed or whether they failed and if so, which ones and why.

So basically removing:

  • Logging of "Executing SQL query"-commands
  • Logging of "Query measurement" lines
  • Logging of "Derived measurement" lines

Additional context One could assume that the user knows that when using this command he/she knows the information is limited.

Soda SQL Version:

HansBambel avatar Nov 02 '21 11:11 HansBambel

Thank you @HansBambel I think currently logs are too chatty :) Do you have any thoughts on verbose levels and what it should print ?

vijaykiran avatar Nov 02 '21 13:11 vijaykiran

I'd propose something along those lines:

verbose -1: Only output failed tests:

  | Scanning tables/scan.yml ...
  | 7 tests executed
  | 1 of 7 tests failed:
  |   Test column(my_column) test(2500 < avg) failed with measurements 2100.20
  | Errors occurred!
  |   [test_execution_error] Test "2500 < avg" failed
  | Exiting with code 1

verbose 0 (default): Passed and failed tests:

  | Scanning tables/report_damages/report.yml ...
  | No Soda Cloud account configured
  | Test sqlmetric(0) test(my_column1 > 0.0) passed with measurements {"expression_result": 0.141612164498951, "my_column1": 0.141612164498951}
  | Test test(row_count > 0) passed with measurements {"expression_result": 727494, "row_count": 727494}
  | Test column(my_column2) test(avg > 1000) passed with measurements {"expression_result": 1294.54153513079, "avg": 1294.54153513079}
  | Test column(my_column2) test(avg < 1600) passed with measurements {"expression_result": 1294.54153513079, "avg": 1294.54153513079}
  | Test column(my_column3) test(avg > 400) passed with measurements {"expression_result": 505.662826908393, "avg": 505.662826908393}
  | Test column(my_column3) test(avg < 400) failed with measurements {"expression_result": 505.662826908393, "avg": 505.662826908393}
  | Test column(my_column4) test(avg > 2500) passed with measurements {"expression_result": 2872.11595225701, "avg": 2872.11595225701}
  | Test column(my_column4) test(avg < 3100) passed with measurements {"expression_result": 2872.11595225701, "avg": 2872.11595225701}
  | 8 tests executed
  | 1 of 8 tests failed:
  |   Test column(my_column3) test(avg < 400) failed with measurements {"expression_result": 505.662826908393, "avg": 505.662826908393}
  | Exiting with code 1

verbose 1: Include the Query measurements and Derived measurements

  | Query measurement: row_count = 123456
  | Query measurement: values_count(id) = 123456
  | Query measurement: values_count(my_column1) = 122345
  | Query measurement: values_count(my_column2) = 87654
  | Query measurement: values_count(my_column3) = 87654
  | Query measurement: values_count(my_column4) = 87654
  | Derived measurement: missing_percentage(id) = 0.0
  | Derived measurement: missing_count(id) = 0
  | Derived measurement: values_percentage(id) = 100.0
  | Derived measurement: missing_percentage(my_column1) = 55.45
  | Derived measurement: missing_count(my_column1) = 55000
...
  | Test sqlmetric(0) test(my_column1 > 0.0) passed with measurements {"expression_result": 0.141612164498951, "my_column1": 0.141612164498951}
  | Test test(row_count > 0) passed with measurements {"expression_result": 727494, "row_count": 727494}
  | Test column(my_column2) test(avg > 1000) passed with measurements {"expression_result": 1294.54153513079, "avg": 1294.54153513079}
  | Test column(my_column2) test(avg < 1600) passed with measurements {"expression_result": 1294.54153513079, "avg": 1294.54153513079}
  | Test column(my_column3) test(avg > 400) passed with measurements {"expression_result": 505.662826908393, "avg": 505.662826908393}
  | Test column(my_column3) test(avg < 400) failed with measurements {"expression_result": 505.662826908393, "avg": 505.662826908393}
  | Test column(my_column4) test(avg > 2500) passed with measurements {"expression_result": 2872.11595225701, "avg": 2872.11595225701}
  | Test column(my_column4) test(avg < 3100) passed with measurements {"expression_result": 2872.11595225701, "avg": 2872.11595225701}
  | 8 tests executed
  | 1 of 8 tests failed:
  |   Test column(my_column3) test(avg < 400) failed with measurements {"expression_result": 505.662826908393, "avg": 505.662826908393}
  | Exiting with code 1

And finally verbose 2: including the SQL statements as well.

HansBambel avatar Nov 02 '21 13:11 HansBambel