dbt-oracle icon indicating copy to clipboard operation
dbt-oracle copied to clipboard

[Bug] hash_collissions in dbt snapshot

Open espenhoh opened this issue 1 year ago • 1 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

snapshot is implemented with the ora_hash() function. This gives a lot of collissions in a large data set and duplicate ids, causing errors in the merge statement for data volumes of millions of rows. dbt snapshot will fail once a hash collission appears, and the snapshot tables becomes impossible to update.

Expected Behavior

No collissions,

Steps To Reproduce

run dbt snapshot command on data with a few million updates. First dbt snapshot command succeeds as there are only inserts, but duplicates in dbt_scd_id exists causing trouble on subsequest runs.

Relevant log output using --debug flag enabled

06:08:52    Database Error in snapshot table_name (snapshots\table_name.sql)
  ORA-30926: unable to get a stable set of rows in the source tables
  Help: https://docs.oracle.com/error-help/db/ora-30926/
  compiled Code at target\run\dbt_project\snapshots\table_name.sql

Environment

- OS: Windows
- Python: 3.11
- dbt: 1.8.1

What Oracle database version are you using dbt with?

19c

Additional Context

No response

espenhoh avatar Sep 03 '24 11:09 espenhoh