soda-core

Invalid user defined failed rows check configuration key "column" for failed rows check

Open: rvo1994 opened this issue 5 months ago • 1 comment

Hi,

Based on the failed rows check documentation (which says it supports Soda Core), I should be able to pass a column configuration to a failed rows check:

checks for dim_product:
  # with SQL query
  - failed rows:
      name: Brand must be LUCKY DOG
      column: product_line
      fail query: |
        SELECT *
        FROM dim_product
        WHERE product_line LIKE '%LUCKY DOG%'
  # with CTE
  - failed rows:
      name: Brand must be LUCKY DOG
      column: product_line
      fail condition: brand LIKE '%LUCKY DOG%'

I tried it with the following check definition:

# Singular checks for gold.dim_mst_employee
filter dim_mst_employee [is_current_filter]:
  where: is_current
 
checks for dim_mst_employee [is_current_filter]:
  - failed rows:
      name: DOB has to be at least 15 years ago
      column: birth_date
      fail condition: birth_date > current_date() - interval '15 years'
      attributes:
        check_type: Not allowed value

However, this throws the following error:

ERROR  | Invalid user defined failed rows check configuration key "column" | line=52,col=5 in 3. Gold Checks/mst_employee/dim_mst_employee.yml

I am using the following versions of Soda Core and the Spark adapter (on Databricks):

soda-core==3.5.0
soda-core-spark==3.5.0

Is this a bug, or is the column key not supported by Soda Core for failed rows checks even though the documentation says it is? I would like the column to be available in the get_scan_results() output dictionary (e.g. scan_results['queries'][0]['column']) like it is for other checks.
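For anyone hitting the same error in the meantime, a possible stopgap is sketched below. It is not taken from the Soda documentation. It assumes the SodaCL file has already been loaded into a Python dict (for example with a YAML parser), and the helper name split_failed_rows_columns is hypothetical. It removes the rejected column key from each failed rows check and keeps a separate mapping from check name to column, so the column can be re-attached to the results afterwards.

```python
from copy import deepcopy


# Hypothetical helper, not part of soda-core: strips the "column" key from
# every "failed rows" check and remembers it, keyed by the check's name.
def split_failed_rows_columns(sodacl: dict) -> tuple[dict, dict]:
    cleaned = deepcopy(sodacl)  # leave the caller's dict untouched
    columns_by_check_name = {}
    for section, checks in cleaned.items():
        # Only "checks for ..." sections hold a list of checks;
        # "filter ..." sections and similar are plain dicts.
        if not section.startswith("checks for") or not isinstance(checks, list):
            continue
        for check in checks:
            if not isinstance(check, dict):
                continue  # plain string checks such as "row_count > 0"
            config = check.get("failed rows")
            if isinstance(config, dict) and "column" in config:
                column = config.pop("column")
                name = config.get("name")
                # Without a name there is nothing to join on later,
                # so the column is only recorded for named checks.
                if name is not None:
                    columns_by_check_name[name] = column
    return cleaned, columns_by_check_name


# The check definition from this issue, written as a Python dict.
sodacl = {
    "filter dim_mst_employee [is_current_filter]": {"where": "is_current"},
    "checks for dim_mst_employee [is_current_filter]": [
        {
            "failed rows": {
                "name": "DOB has to be at least 15 years ago",
                "column": "birth_date",
                "fail condition": "birth_date > current_date() - interval '15 years'",
                "attributes": {"check_type": "Not allowed value"},
            }
        }
    ],
}

cleaned, columns = split_failed_rows_columns(sodacl)
print(columns)
```

The cleaned dict could then be serialized back to YAML and passed to the scan, and the mapping joined against the scan results by check name. That last step assumes each check entry in the results carries its name, which I have not verified for failed rows checks.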

Thanks!

rvo1994, Jul 25 '25 13:07

CLOUD-9196

tools-soda, Jul 25 '25 13:07