soda-sql
Columns not getting excluded Soda Snowflake
Although the columns are listed in the excluded_columns section, they still appear in the scan output. I can see them included in the generated query, and measurements and tests are executed for them as well.
Below is the sample YAML file, followed by the generated query and the scan output:
table_name: SAMPLEDATA
metrics:
- row_count
- missing_count
- missing_percentage
- values_count
- values_percentage
- valid_count
- valid_percentage
- invalid_count
- invalid_percentage
- min_length
- max_length
- avg_length
- min
- max
- avg
- sum
#- variance
#- stddev
excluded_columns:
- id
tests:
- row_count > 0
columns:
  ID:
    tests:
    - max > 0
SELECT
COUNT(*),
COUNT(CASE WHEN NOT (ID IS NULL) THEN 1 END),
COUNT(CASE WHEN NOT (ID IS NULL) THEN 1 END),
MIN(ID),
MAX(ID),
AVG(ID),
SUM(ID)
FROM SAMPLEDATA
**QUERY Measurements:**
| Query measurement: values_count(ID) = 3206228
| Query measurement: valid_count(ID) = 3206228
| Query measurement: min(ID) = -2016166185
| Query measurement: max(ID) = 269703432
| Query measurement: avg(ID) = -328444222.192943
| Query measurement: sum(ID) = -1053067061633234
Test Execution: | Test column(ID) test(max > 0) passed with measurements {"expression_result": 269703432, "max": 269703432}
@jairamurs Sorry for the delay. In your YAML I see that id is excluded, but ID still appears under columns with a test. Are you sure that is the exact YAML?
excluded_columns:
- id
tests:
- row_count > 0
columns:
  ID:
    tests:
    - max > 0
@vijaykiran The column-level test was included to show that it gets executed even though the column is excluded. Even without that test, the metrics are still calculated for the column. You can remove the test on the ID column and you will still see query measurements calculated for ID, although it is listed in excluded_columns.
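One thing that might be worth ruling out is the case mismatch between `id` in excluded_columns and `ID` as Snowflake reports it (Snowflake stores unquoted identifiers in upper case). The sketch below is not soda-sql's code; `filter_columns` and the `NAME` column are hypothetical and only illustrate how a case-sensitive comparison would leave `ID` in the scan while a case-insensitive one would drop it.

```python
def filter_columns(table_columns, excluded, case_sensitive=True):
    """Return the columns that remain after applying the exclusion list.

    Hypothetical helper for illustration only; not part of soda-sql.
    """
    if case_sensitive:
        excluded_set = set(excluded)
        return [c for c in table_columns if c not in excluded_set]
    # Compare on lower-cased names so "id" matches "ID".
    excluded_set = {name.lower() for name in excluded}
    return [c for c in table_columns if c.lower() not in excluded_set]


# Assumed column list for SAMPLEDATA; NAME is made up for the example.
table_columns = ["ID", "NAME"]
excluded = ["id"]  # as written in the scan YAML

print(filter_columns(table_columns, excluded, case_sensitive=True))
# ['ID', 'NAME']  -> ID is still scanned
print(filter_columns(table_columns, excluded, case_sensitive=False))
# ['NAME']        -> ID is excluded
```

If this turns out to be the cause, writing `ID` in upper case under excluded_columns may be a workaround, but that is untested here.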