dbt-oracle
dbt-oracle copied to clipboard
[Bug] Oracle run_result.json has rows_affected = 0 always when data are inserted or updated into target table
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
run_results.json has rows_affected = 0 always even when there are data are inserted into target table or updated in the target table.
{
"metadata": {
"dbt_schema_version": "https://schemas.getdbt.com/dbt/run-results/v4.json",
"dbt_version": "1.4.99",
"generated_at": "2024-02-21T17:54:33.596062Z",
"invocation_id": "5ca55e12-7cc4-43be-8ef2-754c4fb1fc0c",
"env": {
}
},
"results": [
{
"status": "success",
"timing": [
{
"name": "compile",
"started_at": "2024-02-21T17:54:31.748147Z",
"completed_at": "2024-02-21T17:54:31.765072Z"
},
{
"name": "execute",
"started_at": "2024-02-21T17:54:31.801492Z",
"completed_at": "2024-02-21T17:54:33.039715Z"
}
],
"thread_id": "Thread-2 (worker)",
"execution_time": 1.3536255359649658,
"adapter_response": {
"_message": "OK",
"rows_affected": 0
},
"message": "OK",
"failures": null,
"unique_id": "model.DBT_RUN_MATRIX_TEST_2_0.RUN_MATRIX_TGT_3"
},
{
"status": "success",
"timing": [
{
"name": "compile",
"started_at": "2024-02-21T17:54:31.742269Z",
"completed_at": "2024-02-21T17:54:31.763914Z"
},
{
"name": "execute",
"started_at": "2024-02-21T17:54:31.771594Z",
"completed_at": "2024-02-21T17:54:33.061006Z"
}
],
"thread_id": "Thread-1 (worker)",
"execution_time": 1.3784840106964111,
"adapter_response": {
"_message": "OK",
"rows_affected": 0
},
"message": "OK",
"failures": null,
"unique_id": "model.DBT_RUN_MATRIX_TEST_2_0.RUN_MATRIX_TGT_1"
}
],
"elapsed_time": 4.044718503952026,
"args": {
}
}
Expected Behavior
Expected run_results.json have
- code to indicate insert, update, or delete
- corresponding rows_affected reflects the number of rows are either deleted, inserted, or updated.
Steps To Reproduce
Create a flow with Oracle as source and target, Run the flow to get the run_results.json
Relevant log output using --debug flag enabled
No response
Environment
- OS:
- Python:
- dbt:1.4.9
What Oracle database version are you using dbt with?
No response
Additional Context
No response
@cindyzh-ibm
Which python driver are you using ?
is it cx_Oracle or the new oracledb python driver.
dbt-oracle uses the python driver's cursor.rowcount attribute to populate rows_affected in run_results.json.
Please switch to the new driver if you have not done. To switch to the thin driver, you can set the following environment variable before running dbt
export ORA_PYTHON_DRIVER_TYPE=thin
This should populate rows_affected correctly.
Thanks, @aosingh , Let me check and get back to you
Hi @aosingh we are using oracledb python driver. The default is cx_oracle which is deprecating, so we are using oracledb and we do using thin mode. Following is the piece from our log.
15 2/21/2024 09:59:32 INFO IIS-DSEE-TODT-00002 <dbt,0> 17:59:26 oracle adapter: Running in thin mode
16 2/21/2024 09:59:32 INFO IIS-DSEE-TODT-00002 <dbt,0> 17:59:26 Running with dbt=1.4.99
This is our command
dbt
-command 'ORA_PYTHON_DRIVER_TYPE="thin" dbt run --vars \'{....}\' --profiles-dir "..."
;
Thank you @cindyzh-ibm.
What is the version of oracledb driver ?
python3 -c "import oracledb; print(oracledb.__version__)"
Also, please share the output of command
dbt --version
I have been trying to reproduce the problem with dbt-oracle==1.4.4 which is the latest version published in 1.4.x release line. Are you using incremental models ?
I tried a few simple cases which seem to work fine.
dbt seed- Insertion works as expected
"results": [
{
"status": "success",
"timing": [
{
"name": "compile",
"started_at": "2024-02-23T01:26:39.822399Z",
"completed_at": "2024-02-23T01:26:39.822408Z"
},
{
"name": "execute",
"started_at": "2024-02-23T01:26:39.824222Z",
"completed_at": "2024-02-23T01:26:40.628122Z"
}
],
"thread_id": "Thread-1",
"execution_time": 0.8198621273040771,
"adapter_response": {
"_message": "INSERT 5",
"code": "INSERT",
"rows_affected": 5
},
"message": "INSERT 5",
"failures": null,
"unique_id": "seed.dbt_adbs_test_project.seed"
}
],
- Table materialization - Creation of Table also shows the rows affected
"results": [
{
"status": "success",
"timing": [
{
"name": "compile",
"started_at": "2024-02-23T01:28:30.549139Z",
"completed_at": "2024-02-23T01:28:30.555564Z"
},
{
"name": "execute",
"started_at": "2024-02-23T01:28:30.556883Z",
"completed_at": "2024-02-23T01:28:31.258269Z"
}
],
"thread_id": "Thread-1",
"execution_time": 0.730828046798706,
"adapter_response": {
"_message": "OK",
"rows_affected": 5
},
"message": "OK",
"failures": null,
"unique_id": "model.dbt_adbs_test_project.people"
}
],
cc: @anthony-tuininga
@aosingh
sh-4.4$ python3 -c "import oracledb; print(oracledb.__version__)"
1.4.1
sh-4.4$ dbt --version
03:06:30 oracle adapter: Running in cx mode
03:06:30 oracle adapter: [WARNING]: cx_oracle is no longer maintained, use python-oracledb
To switch to python-oracledb set the environment variable ORA_PYTHON_DRIVER_TYPE=thin
Read the guideline here: https://docs.getdbt.com/reference/warehouse-setups/oracle-setup#configure-the-python-driver-mode
Documentation for python-oracledb can be found here: https://oracle.github.io/python-oracledb/
Core:
- installed: 1.4.99
- latest: 1.7.8 - Update available!
Your version of dbt-core is out of date!
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
Plugins:
- bigquery: 1.4.1 - Update available!
- oracle: 1.4.8 - Update available!
- postgres: 1.4.99 - Update available!
- redshift: 1.4.1 - Update available!
- snowflake: 1.4.5 - Update available!
- teradata: 1.4.99 - Update available!
At least one plugin is out of date or incompatible with dbt-core.
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
For your question of "Are you using incremental models ?". No
{{
config(
materialized='datastage'
, table_action='append'
....
)
}}
@cindyzh-ibm ismaterialized='datastage' a custom materialization which you have implemented ? dbt doesn't implement "datastage" materialization.
Could you verify "target" relation is returned in the end ?
-- Return the relations created in this materialization
{{ return({'relations': [target_relation]}) }}
@aosingh Yes, we use custom materialization.
No, we don't have "target" relation is returned in the end. We prepare statement and run run_query macro which delegates execution to dbt core by means of specific adapter
{{
config(
materialized='datastage'
, write_mode='update'
, key_columns=['ID']
, force_commit='false'
, use_adapter_drop_statement='false'
)
}}
{{
config(
materialized='datastage'
, table_action='append'
, write_mode='insert'
, force_commit='false'
, use_adapter_drop_statement='false'
)
}}
@cindyzh-ibm what is this statement ? is it an anonymous PL/SQL block or CTAS ? Details will help me reproduce the problem.
@aosingh custom materialization prepares and submits sql statements
@cindyzh-ibm
I understand these are SQL statements. What is the last statement executed in your custom materialization?
dbt-oracle uses cursor.rowcount to get the number of rows affected
The documentation states that cursor.rowcount is a read-only attribute which specifies the number of rows that have currently been fetched from the cursor (for select statements) or that have been affected by the operation (for insert, update, delete and merge statements). For all other statements the value is always zero.
Immediately after the cursor.execute() call has been made, if no rows have been fetched, the result will be 0. If you call cursor.fetchone() then result will be 1 and if you call cursor.fetchmany(5) then the result will be 6, and so forth
Did you try returning the target relation in the end ?
-- Return the relations created in this materialization
{{ return({'relations': [target_relation]}) }}
@aosingh ok. This is insert statement
BEGIN
EXECUTE IMMEDIATE '
insert into "test"."RUN_MATRIX_TGT_1" ("ID", "NAME", "HIRE_DATE", "SALARY")
select "ID", "NAME", "HIRE_DATE", "SALARY"
from "test"."test__dbt_tmp"
';
END;
Update statement
BEGIN
EXECUTE IMMEDIATE '
update "test"."RUN_MATRIX_TGT_3"
set
("ID", "NAME", "HIRE_DATE", "SALARY")
= ( select "ID", "NAME", "HIRE_DATE", "SALARY"
from "test"."test__dbt_tmp"
where
"test"."test__dbt_tmp"."ID" = "test"."test"."ID"
)
';
END;
I haven't tried following yet. So far we don't have "target" relation is returned in the end.
-- Return the relations created in this materialization
{{ return({'relations': [target_relation]}) }}
@cindyzh-ibm
Thanks, you are executing a PL/SQL(BEGIN..END) block in the custom materialization.
The value for cursor.rowcount will be 0 after executing a PL/SQL block. Let me check if we can use sql%rowcount expression immediately to get the row affected and pass that to dbt.