Vijay Kiran
Add support for checks that are based on the relative percentage difference, e.g.

```yaml
checks for CUSTOMERS:
  - change percent between avg last 7 and count < 10 %
```
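One possible reading of the proposed check, as a rough sketch: average the last 7 historic measurements, compare that average with the current row count, and express the difference as a percentage of the average. The name `change_percent`, the use of the absolute difference, the zero-baseline behaviour, and the sample numbers are all assumptions for illustration and are not part of soda-core.

```python
from statistics import mean


def change_percent(history, current, window=7):
    """Relative change, in percent, of `current` against the mean of the last `window` values."""
    recent = history[-window:]
    if not recent:
        raise ValueError("no historic measurements to compare against")
    baseline = mean(recent)
    if baseline == 0:
        # Assumption: a zero baseline makes the relative change undefined.
        raise ValueError("baseline average is zero; relative change is undefined")
    return abs(current - baseline) * 100 / abs(baseline)


# Hypothetical row counts from the last 7 scans of CUSTOMERS, then today's count.
previous_counts = [100, 102, 98, 101, 99, 100, 100]
todays_count = 105

# The check passes when the change stays under the 10 % threshold.
passed = change_percent(previous_counts, todays_count) < 10
```

Whether the check should be signed (so that only increases or only decreases fail) is left open here.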
Depends on https://github.com/sodadata/soda-core/issues/1243
See docs/soda_checks_yaml.md, L69, and find the TODO. Should we push it to the user to define the right variables and avoid clashes between the variable names when comparing? So...
Implement an API to validate a scan. The goal is to log as many errors as possible without opening any warehouse connection.
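A minimal sketch of the idea, assuming the checks file has already been parsed into plain Python data: walk the whole structure and collect every problem instead of stopping at the first one. The function `validate_checks` and its three structural rules are hypothetical; they are not the soda-core API and cover only a small part of what a real validator would check.

```python
def validate_checks(parsed):
    """Collect every structural problem in an already-parsed checks document.

    Nothing here opens a warehouse connection: the input is plain Python data.
    """
    if not isinstance(parsed, dict):
        return ["top level must be a mapping"]

    errors = []
    for header, checks in parsed.items():
        # Hypothetical rule 1: every section starts with "checks for ".
        if not isinstance(header, str) or not header.startswith("checks for "):
            errors.append(f"unexpected section header: {header!r}")
            continue
        # Hypothetical rule 2: a section holds a list of checks.
        if not isinstance(checks, list):
            errors.append(f"{header}: expected a list of checks")
            continue
        # Hypothetical rule 3: a check is a string or a single-key mapping.
        for index, check in enumerate(checks):
            if isinstance(check, str):
                continue
            if isinstance(check, dict) and len(check) == 1:
                continue
            errors.append(f"{header}: check #{index} must be a string or a single-key mapping")
    return errors
```

Returning a list rather than raising lets the caller log all findings in one pass.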
With these checks:

```yaml
checks for dim_customer:
  - invalid_count(marital_status) = 0:
      valid values:
        - W
```

the scan reports no errors:

```shell
➜ soda scan -c configuration.yml -d adventureworks checks1.yml
Soda...
```
Relevant checks.yml:

```
checks for actor:
  - count > 0
  - column types:
      actor_id: integer
      first_name: varchar
      last_update: timestamp
```

```
DEBUG | column types [FAILED]
DEBUG | column_type_mismatch[last_update] expected(timestamp)...
```
Add a CLI command for validating the connection and building the first checks. E.g. `soda connect` will check all the configured data sources.

```
soda connect
datasource redshift_xyz [OK]
datasource postgres_dev [OK]...
```
All data sources have reserved keywords. If these are used for table or column names, we have to quote them. Can we just quote everything to get around the issue?
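As a rough sketch of what "quote everything" could look like: wrap every identifier in the data source's quote character and double any embedded quote. The helpers `quote_identifier` and `qualified_name` are hypothetical names, not soda-core functions, and the default double quote is an assumption; some data sources (MySQL by default, BigQuery) use backticks instead.

```python
def quote_identifier(name, quote_char='"'):
    """Wrap an identifier in quotes, doubling any embedded quote character."""
    escaped = name.replace(quote_char, quote_char * 2)
    return f"{quote_char}{escaped}{quote_char}"


def qualified_name(*parts, quote_char='"'):
    """Quote each part of a dotted name such as schema.table."""
    return ".".join(quote_identifier(part, quote_char) for part in parts)


# "select" and "order" are reserved words in most SQL dialects.
sql = f"SELECT COUNT({quote_identifier('select')}) FROM {qualified_name('public', 'order')}"
```

One trade-off to weigh: in data sources such as PostgreSQL, unquoted identifiers are case-folded while quoted ones are matched case-sensitively, so quoting everything can change which tables and columns a name resolves to.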
Add support for Hive similar to that in Soda SQL: https://docs.soda.io/soda/warehouse_types.html#apache-hive-experimental