dbt-impala
dbt-impala copied to clipboard
InvalidOperationException: new table <dbname>.<table> already exists
Hi,
We have a problem with drop,rename statements when using dbt-impala.
The firs dbt run command works nicely and the tables created. However, the second execution of dbt run command result like this:
$ dbt run
11:50:56 Running with dbt=1.0.4
11:50:57 Found 3 models, 0 tests, 0 snapshots, 0 analyses, 181 macros, 0 operations, 3 seed files, 0 sources, 0 exposures, 0 metrics
11:50:57
11:50:57 Concurrency: 1 threads (target='dev')
11:50:57
11:50:57 1 of 3 START table model testdb.stg_customers................................... [RUN]
11:50:58 1 of 3 ERROR creating table model testdb.stg_customers.......................... [ERROR in 0.97s]
11:50:58 2 of 3 START table model testdb.stg_orders...................................... [RUN]
11:50:58 2 of 3 ERROR creating table model testdb.stg_orders............................. [ERROR in 0.67s]
11:50:58 3 of 3 SKIP relation testdb.customers_dbt....................................... [SKIP]
11:50:58
11:50:58 Finished running 3 table models in 1.88s.
11:50:58
11:50:58 Completed with 2 errors and 0 warnings:
11:50:58
11:50:58 Runtime Error in model stg_customers (models/staging/stg_customers.sql)
11:50:58 Runtime Error
11:50:58 ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
11:50:58 CAUSED BY: InvalidOperationException: new table testdb.stg_customers already exists
11:50:58
11:50:58
11:50:58 Runtime Error in model stg_orders (models/staging/stg_orders.sql)
11:50:58 Runtime Error
11:50:58 ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
11:50:58 CAUSED BY: InvalidOperationException: new table testdb.stg_orders already exists
11:50:58
11:50:58
11:50:58 Done. PASS=0 WARN=0 ERROR=2 SKIP=1 TOTAL=3
11:50:59 Error sending message, disabling tracking
It says testdb.stg_orders and testdb.stg_orders tables are still there and not dropped. But when we query from Impala(HUE): We get this error:
AnalysisException: Could not resolve table reference: 'testdb.stg_customers'
And if we query from Hive(HUE) we see the tables not dropped. To solve this problem we have to execute invalidate metadata query and Impala start to see the tables. But still tables are not dropped so rename operation failed.
drop table if exists testdb.stg_orders
11:28:12.728425 [debug] [Thread-1 ]: SQL status: OK in 0.01 seconds
11:28:12.731421 [debug] [Thread-1 ]: Using impala connection "model.jaffle_shop.stg_orders"
11:28:12.731667 [debug] [Thread-1 ]: On model.jaffle_shop.stg_orders: /* {"app": "dbt", "dbt_version": "1.0.4", "profile_name": "jaffle_shop", "target_name": "dev", "node_id": "model.jaffle_shop.stg_orders"} */
alter table testdb.stg_orders__dbt_tmp rename to testdb.stg_orders
11:28:12.771044 [debug] [Thread-1 ]: finished collecting timing info
11:28:12.771395 [debug] [Thread-1 ]: On model.jaffle_shop.stg_orders: ROLLBACK
11:28:12.771598 [debug] [Thread-1 ]: Failed to rollback 'model.jaffle_shop.stg_orders'
11:28:12.771854 [debug] [Thread-1 ]: On model.jaffle_shop.stg_orders: Close
11:28:12.772400 [debug] [Thread-1 ]: Runtime Error in model stg_orders (models/staging/stg_orders.sql)
Runtime Error
ImpalaRuntimeException: Error making 'alter_table' RPC to Hive Metastore:
CAUSED BY: InvalidOperationException: new table testdb.stg_orders already exists
Is this error related with dbt-impala?
Thanks for helping.
Hey @bhctsntrk - can you let us know which version of the adapter you are using?
If you are on dbt-impala==1.0.4 can you give it a try dbt-impala==1.0.5rc2 - we included a fix for #16
It looks like you're using the jaffle-shop demo, where the staging models are materialised as views - I think you are likely running into #16 which is incorrectly handling those views as tables
Hello @sdairs , We updated to 1.0.5rc2 after encountering with #16 . For additional information our model structure is:
├── models
│ ├── customers_dbt.sql
│ └── staging
│ ├── stg_customers.sql
│ └── stg_orders.sql
'''customers_dbt.sql'''
with customers as (
select * from {{ ref('stg_customers') }}
),
orders as (
select * from {{ ref('stg_orders') }}
),
customer_orders as (
select
customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders
from orders
group by 1
),
final as (
select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order_date,
customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
from customers
left join customer_orders using (customer_id)
)
select * from final
'''staging/stg_orders.sql'''
select
id as order_id,
user_id as customer_id,
order_date,
status
from testdb.js_orders
'''staging/stg_customers.sql'''
select id as customer_id, first_name, last_name
from testdb.js_customers
'''dbt_project.yml'''
name: 'jaffle_shop'
config-version: 2
version: '0.1'
profile: 'jaffle_shop'
model-paths: ["models"]
seed-paths: ["seeds"]
test-paths: ["tests"]
analysis-paths: ["analysis"]
macro-paths: ["macros"]
target-path: "target"
clean-targets:
- "target"
- "dbt_modules"
- "logs"
require-dbt-version: [">=1.0.0", "<2.0.0"]
models:
jaffle_shop:
materialized: table
staging:
materialized: table
Thanks @bhctsntrk - will see if we can reproduce the error here
Hi @bhctsntrk - we merged #27 included in release 1.0.5 - can you give it a try and let us know if it helps with your issue at all? Thanks
@tovganesh I caught up with @bhctsntrk and it seems he is still facing the issue - we'll have to take another look at repro
@bhctsntrk let me know if you are still facing this issue, I am not able to reproduce the issue in the latest version 1.3.2.