dbt-snowflake icon indicating copy to clipboard operation
dbt-snowflake copied to clipboard

[Bug] dbt unit tests raise an error when `QUOTED_IDENTIFIERS_IGNORE_CASE = true` in Snowflake

Open ivanroquev opened this issue 8 months ago • 13 comments

Is this a new bug in dbt-core?

  • [X] I believe this is a new bug in dbt-core
  • [X] I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

No significative log output without log-level debug and the error is unexpected.

14:14:13 Finished running 4 data tests, 1 unit test, 1 project hook in 0 hours 0 minutes and 9.68 seconds (9.68s). 14:14:13 Command end result 14:14:13
14:14:13 Completed with 1 error and 0 warnings: 14:14:13
14:14:13 'actual_or_expected' 14:14:13
14:14:13 Done. PASS=4 WARN=0 ERROR=1 SKIP=0 TOTAL=5

Expected Behavior

Complete the test successfully like:

06:57:00 Finished running 4 data tests, 1 unit test, 1 project hook in 0 hours 0 minutes and 11.69 seconds (11.69s). 06:57:00
06:57:00 Completed successfully 06:57:00
06:57:00 Done. PASS=5 WARN=0 ERROR=0 SKIP=0 TOTAL=5

Steps To Reproduce

With this config:

models:
  - name: fact_chargers
    description: This table has basic information about charges
    columns:
      - name: country
        data_tests:
          - accepted_values:
              values: [
              'LU',
              'BE',
          ]
      - name: transaction_id
        data_tests:
          - unique
          - not_null
      - name: external_id
      - name: promotion_id
      - name: is_promotion_applied
        data_tests:
          - accepted_values:
              name: is_promotion_applied_accepted_values
              values: [ true, false ]


unit_tests:
  - name: test_fact_chargers_external_id
    description: "Unit test FACT_CHARGERS"
    model: fact_chargers
    overrides:
      macros:
        # unit test this model in "full refresh" mode
        is_incremental: true
    given:
      - input: source("DW", "fact_chargers")
        rows:
           - {
               transaction_id: transaction1,
               business_date: 2024-06-11,
               external_id: null,
               promotion_id: null,
               is_promotion_applied: false,
             }
           - {
              transaction_id: transaction2,
              business_date: 2024-06-11,
              external_id: null,
              promotion_id: null,
              is_promotion_applied: false,
             }
      - input: source("DW", "fact_promotions")
        rows:
          - {subtype: redemption, channel: emobility, external_id: ext-abcd, business_date: 2024-06-11, coupon_id: abcd1234, ticket_id: transaction1}
          - {subtype: redemption, channel: emobility, external_id: ext-addd, business_date: 2024-06-11, coupon_id: addd1235, ticket_id: transaction2}
          - {subtype: assignation, channel: emobility, external_id: ext-asig, business_date: 2024-06-11, coupon_id: addd1235, ticket_id: transaction1}
          - {subtype: redemption, channel: store, external_id: ext-store-redemption, business_date: 2024-06-11, coupon_id: addd1235, ticket_id: transaction2}
    expect:
      rows:
        - {
            transaction_id: transaction1,
            business_date: 2024-06-11,
            external_id: ext-abcd,
            promotion_id: abcd1234,
            is_promotion_applied: true,
          }
        - {
            transaction_id: transaction2,
            business_date: 2024-06-11,
            external_id: ext-addd,
            promotion_id: addd1235,
            is_promotion_applied: true,
          }

Running the test like that:

dbt test --select fact_chargers -t stg --vars '{"from_date_key": "2024-06-10", "to_date_key": "2024-06-11"}'

With this output:

+--------------+-------------+-----------+------------+--------------------+------------------+
|TRANSACTION_ID|BUSINESS_DATE|EXTERNAL_ID|PROMOTION_ID|IS_PROMOTION_APPLIED|ACTUAL_OR_EXPECTED|
+--------------+-------------+-----------+------------+--------------------+------------------+
|transaction1  |2024-06-11   |ext-abcd   |abcd1234    |true                |actual            |
|transaction2  |2024-06-11   |ext-addd   |addd1235    |true                |actual            |
|transaction1  |2024-06-11   |ext-abcd   |abcd1234    |true                |expected          |
|transaction2  |2024-06-11   |ext-addd   |addd1235    |true                |expected          |
+--------------+-------------+-----------+------------+--------------------+------------------+

Relevant log output

14:14:12  SQL status: SUCCESS 4 in 1.0 seconds
14:14:12  Applying DROP to: DEVELOPMENT.star_model.test_fact_chargers_external_id__dbt_tmp
14:14:12  Using snowflake connection "unit_test.bs.fact_chargers.test_fact_chargers_external_id"
14:14:12  On unit_test.bs.fact_chargers.test_fact_chargers_external_id: /* {"app": "dbt", "dbt_version": "1.8.2", "profile_name": "BS", "target_name": "stg", "node_id": "unit_test.bs.fact_chargers.test_fact_chargers_external_id"} */
drop table if exists DEVELOPMENT.star_model.test_fact_chargers_external_id__dbt_tmp cascade
14:14:12  SQL status: SUCCESS 1 in 0.0 seconds
 -------- SCHEMA ------------- 
| column               | data_type |
| -------------------- | --------- |
| TRANSACTION_ID       | Text      |
| BUSINESS_DATE        | Date      |
| EXTERNAL_ID          | Text      |
| PROMOTION_ID         | Text      |
| IS_PROMOTION_APPLIED | Boolean   |
| ACTUAL_OR_EXPECTED   | Text      |

actual
14:14:12  On unit_test.bs.fact_chargers.test_fact_chargers_external_id: Close
14:14:13  Unhandled error while executing 
'actual_or_expected'
14:14:13  Traceback (most recent call last):
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/base.py", line 368, in safe_run
    result = self.compile_and_execute(manifest, ctx)
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/base.py", line 314, in compile_and_execute
    result = self.run(ctx.node, manifest)
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/base.py", line 415, in run
    return self.execute(compiled_node, manifest)
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/test.py", line 264, in execute
    unit_test_node, unit_test_result = self.execute_unit_test(test, manifest)
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/test.py", line 237, in execute_unit_test
    actual = self._get_unit_test_agate_table(table, "actual")
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/test.py", line 339, in _get_unit_test_agate_table
    unit_test_table = result_table.where(
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/agate/table/where.py", line 21, in where
    if test(row):
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/dbt/task/test.py", line 340, in <lambda>
    lambda row: row["actual_or_expected"] == actual_or_expected
  File "/Users/ivan.roque/Library/Python/3.9/lib/python/site-packages/agate/mapped_sequence.py", line 88, in __getitem__
    return self.dict()[key]
KeyError: 'actual_or_expected'

14:14:13  5 of 5 ERROR fact_chargers::test_fact_chargers_external_id ..................... [ERROR in 3.62s]

Environment

- OS: MacOS
- Python: 3.9
- dbt: 1.8.2
- dbt-snowflake: 1.8.2

Which database adapter are you using with dbt?

snowflake

Additional Context

I've had to replace the _get_unit_test_agate_table method to parse all dict keys to lowercase like this:

def _get_unit_test_agate_table(self, result_table, actual_or_expected: str):
    import agate
    # Convertir todas las claves de cada fila a lowercase
    def lowercase_keys(row):
        return {k.lower(): v for k, v in row.items()}

    # Convertir todas las filas del result_table a lowercase
    lowercased_rows = [lowercase_keys(row) for row in result_table.rows]

    filtered_rows = [
        row for row in lowercased_rows if row.get("actual_or_expected") == actual_or_expected.lower()
    ]

    # Convertir las filas filtradas de nuevo a un objeto agate.Table
    unit_test_table = agate.Table.from_object(filtered_rows)

    columns = list(unit_test_table.columns.keys())
    columns.remove("actual_or_expected")

    return unit_test_table.select(columns)

The original method is:

def _get_unit_test_agate_table(self, result_table, actual_or_expected: str):
    unit_test_table = result_table.where(
        lambda row: row["actual_or_expected"].lower() == actual_or_expected
    )
    columns = list(unit_test_table.columns.keys())
    columns.remove("actual_or_expected")
    return unit_test_table.select(columns)

ivanroquev avatar Jun 12 '24 08:06 ivanroquev