dbt-starrocks icon indicating copy to clipboard operation
dbt-starrocks copied to clipboard

Bug: "DROP UNKNOWN" error on second run for table materializations with StarRocks >= 3.4

Open progammer3000 opened this issue 6 months ago • 3 comments

When running a dbt model with materialized='table' against a recent version of StarRocks (e.g., 3.4.3), the first run succeeds, but all subsequent runs fail with a SQL syntax error: No viable statement for input 'drop unknown'.

This is caused by a change in StarRocks' information_schema.tables, where the table_type for a standard table is now 'TABLE' instead of 'BASE TABLE'. The dbt-starrocks adapter's metadata query doesn't recognize 'TABLE' and misclassifies the existing object as 'unknown', leading to a malformed DROP statement (DROP UNKNOWN my_schema.my_table) before attempting to recreate it.

Steps to Reproduce:

  • Set up Environment:

    1. StarRocks version: 3.4.3-a01aa59 (tested using starrocks/allin1-ubuntu:latest container image).
    2. dbt-starrocks adapter (e.g., version 1.9.0 or later).

Create a dbt model (models/my_model.sql):

{{ config(
    materialized = 'table',
    table_type='DUPLICATE',
    order_by=['event_time', 'event_type']
) }}
SELECT
    event_time,
    event_type,
    user_id,
    device_code,
    channel
FROM {{ source('source', 'raw_detail') }}

(Assume source.raw_detail is a valid, existing table, as it was in my case) Run dbt for the first time: dbt run --select my_model This command succeeds, and the table my_model is created in the target schema.

Run dbt for the second time: dbt run --select my_model This command fails.

Expected behavior: The second dbt run should succeed. dbt should correctly identify the existing table, issue a DROP TABLE statement, and then recreate it.

Actual behavior: The second run fails with the following error:

Database Error in model my_model (models/my_model.sql)
  1064 (HY000): Getting syntax error at line 2, column 5. Detail message: No viable statement for input 'drop unknown'.

System-Level Information dbt-starrocks version: 1.10.0 StarRocks Version: 3.4.3-a01aa59 dbt Core version: 1.9.6

The root cause is in the adapter's primary metadata macro, located at dbt/include/starrocks/macros/adapters/metadata.sql.

The current CASE statement for identifying table types is:

case when tbl.table_type = 'BASE TABLE' then 'table'
     when tbl.table_type = 'VIEW' and mv.table_name is null then 'view'
     when tbl.table_type = 'VIEW' and mv.table_name is not null then 'materialized_view'
     when tbl.table_type = 'SYSTEM VIEW' then 'system_view'
     else 'unknown' end as table_type

On StarRocks 3.4.3, a query to the information_schema reveals the problem: -- SQL Query SELECT table_name, table_type, table_comment FROM information_schema.tables WHERE table_schema = 'analytics' AND table_name = 'my_model';

-- Result
+------------+------------+---------------+
| table_name | table_type | table_comment |
+------------+------------+---------------+
| my_model   | TABLE      |               |
+------------+------------+---------------+

Since table_type is 'TABLE' and not 'BASE TABLE', the CASE statement falls through to the else 'unknown' clause.

Proposed fix: To fix this while maintaining backward compatibility with older StarRocks versions, the CASE statement should be updated to accept both 'TABLE' and 'BASE TABLE'.

Change this line: case when tbl.table_type = 'BASE TABLE' then 'table' To this: case when tbl.table_type IN ('BASE TABLE', 'TABLE') then 'table' I have tested this change locally by modifying the macro file, and it resolves the issue completely. Subsequent runs now succeed as expected.

progammer3000 avatar Jun 06 '25 02:06 progammer3000

Same here

livelif avatar Jun 12 '25 13:06 livelif

Like @progammer3000 said, you just need as a temporary workaround, you can override the default behavior by defining the following macro in your project:

{% macro starrocks__list_relations_without_caching(schema_relation) -%}

  {% call statement('list_relations_without_caching', fetch_result=True) %}
    select
      null as "database",
      tbl.table_name as name,
      tbl.table_schema as "schema",
      case
        when tbl.table_type IN ('BASE TABLE', 'TABLE') then 'table' 
        when tbl.table_type = 'VIEW' and mv.table_name is null then 'view'
        when tbl.table_type = 'VIEW' and mv.table_name is not null then 'materialized_view'
        when tbl.table_type = 'SYSTEM VIEW' then 'system_view'
        else 'unknown'
      end as table_type
    from information_schema.tables tbl
    left join information_schema.materialized_views mv
      on tbl.TABLE_SCHEMA = mv.TABLE_SCHEMA
     and tbl.TABLE_NAME = mv.TABLE_NAME
    where tbl.table_schema = '{{ schema_relation.schema }}'
  {% endcall %}

  {{ return(load_result('list_relations_without_caching').table) }}

{%- endmacro %}

livelif avatar Jun 12 '25 14:06 livelif

Thanks @livelif for your efforts , I actually raised a pull request to fix the issue last week : https://github.com/StarRocks/dbt-starrocks/pull/71.

progammer3000 avatar Jun 12 '25 15:06 progammer3000