dbt-duckdb icon indicating copy to clipboard operation
dbt-duckdb copied to clipboard

Dbt models that are materialized as views result in catalog error (table cannot be found)

Open akshaysu12 opened this issue 2 years ago • 20 comments

Dbt models that are materialized as views are unable to be found:

Catalog Error: Table with name <table_name> does not exist

The sql view model seems to get created without issues:

Concurrency: 1 threads (target='duckdb') 1 of 4 OK created sql view model

Running a select statement on the model returns the missing table error. Wondering if this is a bug/missing feature or if I am missing something in the setup?

I see that there is a duckdb specific create_view_as macro. Does a corresponding view.sql need to be created under the materializations folder?

https://github.com/dbt-labs/dbt-core/blob/dev/louisa-may-alcott/core/dbt/include/global_project/macros/materializations/view/view.sql

akshaysu12 avatar Nov 29 '22 14:11 akshaysu12

hrm, that surprises me, we test that logic pretty thoroughly I just verified it works locally; can you post your duckdb config from profiles.yml and the version of dbt-duckdb you're using?

jwills avatar Nov 29 '22 17:11 jwills

After a little more digging you are correct views do work. When I wrap my view in a CTE however it fails. From this thread it looks like duckdb does not support CTEs in views. https://github.com/duckdb/duckdb/issues/2479.

akshaysu12 avatar Nov 29 '22 22:11 akshaysu12

I don't understand- how do you wrap a view in a CTE? Isn't it dbt's job to turn the SELECT query into a view for you? I do stuff like:

WITH stg as (
  SELECT * FROM {{ source('stg', 'table') }}
)
SELECT * FROM stg

all the time and it works fine.

jwills avatar Nov 29 '22 23:11 jwills

Yes I'm doing the same:

WITH cte_expr AS (
	SELECT 
		tablename.first_select,
		tablename.second_select
	FROM schemaname.tablename
)

SELECT * FROM cte_expr

When I add the tag: +materialized: table then I have no issues. When I change this to +materialized: view or remove any materialization on the model then I run into the catalog error.

When I remove the CTE and use this:

SELECT 
      tablename.first_select,
      tablename.second_select
FROM schemaname.tablename

I have no issues (materialized as a view or a table).

If you cannot reproduce this locally I will keep trying to see what I'm doing wrong on my end!

here's my config:

profiles.yml

    duckdb:
      type: duckdb
      path: 'gpa.duckdb'
      schema: "dev_{{ env_var('USER') }}"

version: dbt-duckdb = "^1.3.2"

akshaysu12 avatar Nov 30 '22 00:11 akshaysu12

Yeah I can't get it to fail, here's what I'm running with:

An ingest.sql model:

{{ config(materialized='view') }}

WITH input AS (
  select *
  from {{ source('csvs', 'test') }}
)
SELECT * FROM input

...and a profile that matches yours, and I see my data in dev_jwills.ingest defined as a view.

To clarify: the dbt run call succeeds, but when you go to open gpa.duckdb using the DuckDB CLI (or an equivalent python script), the view is simply not present in dev_{{ env_var('USER') }}?

What happens if you add a dbt test on the model in your schemas.yml?

jwills avatar Nov 30 '22 00:11 jwills

Sorry I should have been more clear. Yes the dbt run call succeeds. I think the view successfully gets created as well (I can see the view in Dbeaver). When I try to query the view though I run into the catalog error. I'm just running a simple select * from dbtmodelname. Is that an incorrect way of querying a view in duckdb?

I've tried from dbeaver, python client and a dbt test all with the same results.

akshaysu12 avatar Nov 30 '22 01:11 akshaysu12

oh interesting-- I always query SELECT * FROM <schema>.<relation>; maybe that's the difference? And you're saying for some reason SELECT * FROM <relation> works correctly unless the view in question was defined with a CTE?

jwills avatar Nov 30 '22 01:11 jwills

(For relations that aren't defined in the main schema, which is the default for DuckDB)

jwills avatar Nov 30 '22 01:11 jwills

sorry again I am not being specific enough. I am specifying the schema name:

materialized as view

>>> con.execute("SELECT * FROM dev_asubramanian.tablename")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
duckdb.CatalogException: Catalog Error: Table with name tablename does not exist!
Did you mean "dev_asubramanian_1669766215.tablename"?

materialized as table

>>> con.execute("SELECT * FROM dev_asubramanian.tablename")
<duckdb.DuckDBPyConnection object at 0x1243cfcb0>

akshaysu12 avatar Nov 30 '22 01:11 akshaysu12

hrm, and I'm assuming "dev_asubramanian_1669766215" isn't a schema you created?

jwills avatar Nov 30 '22 01:11 jwills

It is a schema I created (unique schema for testing). But it has the same issue.

### >>> con.execute("SELECT * FROM dev_asubramanian_1669766215.tablename")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
duckdb.CatalogException: Catalog Error: Table with name tablename does not exist!
Did you mean "dev_asubramanian_1669766215.tablename"?

akshaysu12 avatar Nov 30 '22 01:11 akshaysu12

well that is really weird. I almost wonder if there is like a character encoding issue somewhere-- for whatever reason, DuckDB doesn't seem to believe that the name of that view in the DB is the same as the name of the view you are querying for. 🤔

jwills avatar Nov 30 '22 01:11 jwills

I am seeing issues when I try to open up the db with the duckdb cli:

./duckdb -init $HOME/analytics/dbt/gpa.duckdb

Error: near line 1: Parser Error: syntax error at or near ""
LINE 1: ??c????DUCK'?+????????
        ^
Error: near line 92: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 104: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 112: Parser Error: syntax error at or near "storder"
LINE 1: storder	16697614321432
        ^
Error: near line 123: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 134: Parser Error: syntax error at or near "der"
LINE 1: der669688817
        ^
Error: near line 145: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...
        ^
Error: near line 493: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 505: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 513: Parser Error: syntax error at or near "storder"
LINE 1: storder	16697614321432
        ^
Error: near line 524: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 535: Parser Error: syntax error at or near "der"
LINE 1: der669688817
        ^
Error: near line 546: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...

akshaysu12 avatar Nov 30 '22 01:11 akshaysu12

I think it’s having issues replaying the WAL which again might be an encoding-related problem

jwills avatar Nov 30 '22 01:11 jwills

I am seeing issues when I try to open up the db with the duckdb cli:

./duckdb -init $HOME/analytics/dbt/gpa.duckdb

Error: near line 1: Parser Error: syntax error at or near ""
LINE 1: ??c????DUCK'?+????????
        ^
Error: near line 92: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 104: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 112: Parser Error: syntax error at or near "storder"
LINE 1: storder	16697614321432
        ^
Error: near line 123: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 134: Parser Error: syntax error at or near "der"
LINE 1: der669688817
        ^
Error: near line 145: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...
        ^
Error: near line 493: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 505: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 513: Parser Error: syntax error at or near "storder"
LINE 1: storder	16697614321432
        ^
Error: near line 524: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 535: Parser Error: syntax error at or near "der"
LINE 1: der669688817
        ^
Error: near line 546: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...

Is there a reason you're passing a duckdb database to a parameter that expects an SQL file?

Mause avatar Dec 02 '22 16:12 Mause

just ignorance. How do I initialize an in-memory db from an existing duckdb file?

akshaysu12 avatar Dec 02 '22 16:12 akshaysu12

@akshaysu12 In your profiles.yml you can specify existing database in the target field with the relative path.

example:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: target/example.duckdb

Lyonsclay avatar Dec 02 '22 17:12 Lyonsclay

Actually maybe you are asking how to do that on the cli. In that case you wouldn't need the --init flag. ./duckdb $HOME/analytics/dbt/gpa.duckdb

Lyonsclay avatar Dec 02 '22 17:12 Lyonsclay

He's asking about the cli tho, which is just duckdb /path/to/the/existing/file.db

Thank you @Mause for spotting the -init arg that I missed!

jwills avatar Dec 02 '22 17:12 jwills

oh that is much simpler, thank you!

akshaysu12 avatar Dec 02 '22 17:12 akshaysu12