[ADAP-1019] [Bug] Table already exists, you need to drop it first in incremental models
Is this a new bug in dbt-spark?
- [X] I believe this is a new bug in dbt-spark
- [X] I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
Whenever an incremental model runs after the first run, dbt-spark with file_format='hudi' fails, stating that the table already exists and needs to be dropped first.
Expected Behavior
Any run after the first should succeed just like the first one, merging new rows into the existing table instead of trying to recreate it.
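For reference, my expectation is that the second run issues a merge rather than a create — roughly the following, based on how dbt-spark's merge strategy renders (the temp view name here is illustrative):

```sql
merge into teste_dbt_dw_spark.kyuubi_incremental_hudi as DBT_INTERNAL_DEST
using kyuubi_incremental_hudi__dbt_tmp as DBT_INTERNAL_SOURCE
on DBT_INTERNAL_SOURCE.prim_key = DBT_INTERNAL_DEST.prim_key
when matched then update set *
when not matched then insert *
```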
Steps To Reproduce
- Using dbt-spark=1.5.2
- Start a Kyuubi server with Hudi enabled
- Run the sample model below twice
```sql
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',
        unique_key='prim_key',
        file_format='hudi',
        location_root='<s3-path>'
    )
}}

select 1 as prim_key
```
Relevant log output
```
org.apache.kyuubi.KyuubiSQLException: org.apache.kyuubi.KyuubiSQLException: Error operating ExecuteStatement: org.apache.spark.sql.AnalysisException: Table teste_dbt_dw_spark.kyuubi_incremental_hudi already exists. You need to drop it first.
```
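Judging by the error, the adapter seems to treat the second run as a first run and issues a full create instead of a merge. My rough guess at the failing statement, assuming the usual create-table-as shape dbt-spark generates (the location suffix is illustrative):

```sql
create table teste_dbt_dw_spark.kyuubi_incremental_hudi
using hudi
location '<s3-path>/kyuubi_incremental_hudi'
as
select 1 as prim_key
```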
Environment
- OS: Ubuntu 20.04
- Python: 3.8.10
- dbt-core: 1.5.8
- dbt-spark: 1.5.2
Additional Context
I'm running Kyuubi because I wasn't able to use the plain Thrift connection method on EMR as described in the docs. I also followed some examples here, but didn't manage to get it working.
It seems to be a problem when the adapter can't read all tables in the catalog. I had some Iceberg tables in the same catalog, and errors kept popping up about those tables.
After deleting the Iceberg tables, the adapter worked as expected.
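If it helps with triage: as far as I can tell, dbt-spark populates its relation cache by listing every table in the schema with something like the statement below. If that statement errors on the Iceberg tables, the cache presumably comes back incomplete, dbt no longer sees the existing Hudi table, and it falls back to a create:

```sql
show table extended in teste_dbt_dw_spark like '*'
```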