dbt-duckdb
dbt-duckdb copied to clipboard
Dbt models that are materialized as views result in catalog error (table cannot be found)
Dbt models that are materialized as views are unable to be found:
Catalog Error: Table with name <table_name> does not exist
The sql view model seems to get created without issues:
Concurrency: 1 threads (target='duckdb') 1 of 4 OK created sql view model
Running a select statement on the model returns the missing table error. Wondering if this is a bug/missing feature or if I am missing something in the setup?
I see that there is a duckdb specific create_view_as macro. Does a corresponding view.sql need to be created under the materializations folder?
https://github.com/dbt-labs/dbt-core/blob/dev/louisa-may-alcott/core/dbt/include/global_project/macros/materializations/view/view.sql
hrm, that surprises me, we test that logic pretty thoroughly I just verified it works locally; can you post your duckdb
config from profiles.yml
and the version of dbt-duckdb you're using?
After a little more digging you are correct views do work. When I wrap my view in a CTE however it fails. From this thread it looks like duckdb does not support CTEs in views. https://github.com/duckdb/duckdb/issues/2479.
I don't understand- how do you wrap a view in a CTE? Isn't it dbt's job to turn the SELECT query into a view for you? I do stuff like:
WITH stg as (
SELECT * FROM {{ source('stg', 'table') }}
)
SELECT * FROM stg
all the time and it works fine.
Yes I'm doing the same:
WITH cte_expr AS (
SELECT
tablename.first_select,
tablename.second_select
FROM schemaname.tablename
)
SELECT * FROM cte_expr
When I add the tag: +materialized: table
then I have no issues.
When I change this to +materialized: view
or remove any materialization on the model then I run into the catalog error.
When I remove the CTE and use this:
SELECT
tablename.first_select,
tablename.second_select
FROM schemaname.tablename
I have no issues (materialized as a view or a table).
If you cannot reproduce this locally I will keep trying to see what I'm doing wrong on my end!
here's my config:
profiles.yml
duckdb:
type: duckdb
path: 'gpa.duckdb'
schema: "dev_{{ env_var('USER') }}"
version:
dbt-duckdb = "^1.3.2"
Yeah I can't get it to fail, here's what I'm running with:
An ingest.sql
model:
{{ config(materialized='view') }}
WITH input AS (
select *
from {{ source('csvs', 'test') }}
)
SELECT * FROM input
...and a profile that matches yours, and I see my data in dev_jwills.ingest
defined as a view.
To clarify: the dbt run
call succeeds, but when you go to open gpa.duckdb
using the DuckDB CLI (or an equivalent python script), the view is simply not present in dev_{{ env_var('USER') }}
?
What happens if you add a dbt test on the model in your schemas.yml?
Sorry I should have been more clear. Yes the dbt run call succeeds. I think the view successfully gets created as well (I can see the view in Dbeaver). When I try to query the view though I run into the catalog error. I'm just running a simple select * from dbtmodelname
. Is that an incorrect way of querying a view in duckdb?
I've tried from dbeaver, python client and a dbt test all with the same results.
oh interesting-- I always query SELECT * FROM <schema>.<relation>
; maybe that's the difference? And you're saying for some reason SELECT * FROM <relation>
works correctly unless the view in question was defined with a CTE?
(For relations that aren't defined in the main
schema, which is the default for DuckDB)
sorry again I am not being specific enough. I am specifying the schema name:
materialized as view
>>> con.execute("SELECT * FROM dev_asubramanian.tablename")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
duckdb.CatalogException: Catalog Error: Table with name tablename does not exist!
Did you mean "dev_asubramanian_1669766215.tablename"?
materialized as table
>>> con.execute("SELECT * FROM dev_asubramanian.tablename")
<duckdb.DuckDBPyConnection object at 0x1243cfcb0>
hrm, and I'm assuming "dev_asubramanian_1669766215" isn't a schema you created?
It is a schema I created (unique schema for testing). But it has the same issue.
### >>> con.execute("SELECT * FROM dev_asubramanian_1669766215.tablename")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
duckdb.CatalogException: Catalog Error: Table with name tablename does not exist!
Did you mean "dev_asubramanian_1669766215.tablename"?
well that is really weird. I almost wonder if there is like a character encoding issue somewhere-- for whatever reason, DuckDB doesn't seem to believe that the name of that view in the DB is the same as the name of the view you are querying for. 🤔
I am seeing issues when I try to open up the db with the duckdb cli:
./duckdb -init $HOME/analytics/dbt/gpa.duckdb
Error: near line 1: Parser Error: syntax error at or near ""
LINE 1: ??c????DUCK'?+????????
^
Error: near line 92: Parser Error: syntax error at or near "der"
LINE 1: der
^
Error: near line 104: Parser Error: syntax error at or near "der"
LINE 1: der
^
Error: near line 112: Parser Error: syntax error at or near "storder"
LINE 1: storder 16697614321432
^
Error: near line 123: Parser Error: syntax error at or near "der"
LINE 1: der
^
Error: near line 134: Parser Error: syntax error at or near "der"
LINE 1: der669688817
^
Error: near line 145: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...
^
Error: near line 493: Parser Error: syntax error at or near "der"
LINE 1: der
^
Error: near line 505: Parser Error: syntax error at or near "der"
LINE 1: der
^
Error: near line 513: Parser Error: syntax error at or near "storder"
LINE 1: storder 16697614321432
^
Error: near line 524: Parser Error: syntax error at or near "der"
LINE 1: der
^
Error: near line 535: Parser Error: syntax error at or near "der"
LINE 1: der669688817
^
Error: near line 546: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...
I think it’s having issues replaying the WAL which again might be an encoding-related problem
I am seeing issues when I try to open up the db with the duckdb cli:
./duckdb -init $HOME/analytics/dbt/gpa.duckdb
Error: near line 1: Parser Error: syntax error at or near "" LINE 1: ??c????DUCK'?+???????? ^ Error: near line 92: Parser Error: syntax error at or near "der" LINE 1: der ^ Error: near line 104: Parser Error: syntax error at or near "der" LINE 1: der ^ Error: near line 112: Parser Error: syntax error at or near "storder" LINE 1: storder 16697614321432 ^ Error: near line 123: Parser Error: syntax error at or near "der" LINE 1: der ^ Error: near line 134: Parser Error: syntax error at or near "der" LINE 1: der669688817 ^ Error: near line 145: Parser Error: syntax error at or near "der" LINE 1: der?^????^?? ?^?????^???^??^??... ^ Error: near line 493: Parser Error: syntax error at or near "der" LINE 1: der ^ Error: near line 505: Parser Error: syntax error at or near "der" LINE 1: der ^ Error: near line 513: Parser Error: syntax error at or near "storder" LINE 1: storder 16697614321432 ^ Error: near line 524: Parser Error: syntax error at or near "der" LINE 1: der ^ Error: near line 535: Parser Error: syntax error at or near "der" LINE 1: der669688817 ^ Error: near line 546: Parser Error: syntax error at or near "der" LINE 1: der?^????^?? ?^?????^???^??^??...
Is there a reason you're passing a duckdb database to a parameter that expects an SQL file?
just ignorance. How do I initialize an in-memory db from an existing duckdb file?
@akshaysu12 In your profiles.yml you can specify existing database in the target field with the relative path.
example:
target: dev
outputs:
dev:
type: duckdb
path: target/example.duckdb
Actually maybe you are asking how to do that on the cli. In that case you wouldn't need the --init flag.
./duckdb $HOME/analytics/dbt/gpa.duckdb
He's asking about the cli tho, which is just duckdb /path/to/the/existing/file.db
Thank you @Mause for spotting the -init
arg that I missed!
oh that is much simpler, thank you!