dbt-redshift icon indicating copy to clipboard operation
dbt-redshift copied to clipboard

[CT-1616] get_relation method returns None for Redshift external tables

Open tomekzbrozek opened this issue 5 years ago • 8 comments

Describe the bug

get_relation method (https://docs.getdbt.com/docs/writing-code-in-dbt/jinja-context/adapter/#get_relation) returns None when provided with database, schema and table name of a Redshift external table (AWS Spectrum).

Steps To Reproduce

  1. Create an external table with AWS Spectrum: https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-external-tables.html
  2. Attempt to get_relation on that external table.

Expected behavior

Should return a regular relation as it does for native Redshift tables, e.g. an output like analytics.spectrum_schema.mytable instead of None

Screenshots and log output

System information

Which database are you using dbt with?

  • [ ] postgres
  • [x] redshift
  • [ ] bigquery
  • [ ] snowflake
  • [ ] other (specify: - - - )

The output of dbt --version:

0.16.0

The operating system you're using: MacOS Mojave

The output of python --version: 3.6.2

Additional context

tomekzbrozek avatar Apr 02 '20 10:04 tomekzbrozek

Thanks for the report @tomekzbrozek!

Looks like we use the Postgres code to query the information schema on redshift: https://github.com/fishtown-analytics/dbt/blob/dev/octavius-catto/plugins/redshift/dbt/include/redshift/macros/adapters.sql#L154

but redshift provides a totally different set of tables for fetching this info: https://docs.aws.amazon.com/redshift/latest/dg/r_SVV_EXTERNAL_TABLES.html

I think we'd need to copy the "base" pg implementation into the redshift plugin, then update the code to also fetch external tables too.

drewbanin avatar Apr 02 '20 16:04 drewbanin

thank you, yup I also used SVV_EXTERNAL_TABLES as a workaround to the issue reported here, pasting below to give a flavour.

({{ schema }} and {{ table }} vars are defined earlier in the body of my macro that serves a different purpose, I basically needed to know if the external table exists of not, this is why I came across this issue :) )

-- check_if_external_table_exists returns 1 if a given external table exists, 0 if doesn't
{% set check_if_external_table_exists %}
SELECT
  COUNT(*) AS if_exists_flag
FROM
  SVV_EXTERNAL_TABLES
WHERE
  schemaname || '.' || tablename = '{{ schema }}.{{ table }}'
{% endset %}
-- the loop below fetches results of check_if_external_table_exists:
-- 0 if table doesn't exist, 1 if table exists, 2 for other (erroneous?) cases
-- more context on the `execute` Jinja variable:
-- https://docs.getdbt.com/docs/writing-code-in-dbt/jinja-context/execute
{% if execute %}
  {% set if_exists_flag = run_query(check_if_external_table_exists).columns[0].values()[0] | int %}
{% else %}
  {% set if_exists_flag = 2 | int %}
{% endif %}

tomekzbrozek avatar Apr 02 '20 16:04 tomekzbrozek

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

github-actions[bot] avatar Apr 11 '22 02:04 github-actions[bot]

Still a bug, still a good first issue for an interested community contributor :)

jtcohen6 avatar Sep 29 '22 16:09 jtcohen6

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

github-actions[bot] avatar Jun 05 '23 02:06 github-actions[bot]

Resolved by https://github.com/dbt-labs/dbt-redshift/pull/428

dbeatty10 avatar Jul 13 '23 12:07 dbeatty10

This might not actually be resolved by dbt-labs/dbt-core#428 😅

Will either re-open this issue or keep closed depending on the outcome of that conversation.

If it is re-opened, the solution may look similar to https://github.com/dbt-labs/dbt-postgres/issues/53, and it may make sense to solve them both at the same time.

dbeatty10 avatar Jul 13 '23 13:07 dbeatty10

FWIW I've adapted @dbeatty10's solution for Spectrum and it worked for me (using dbt-redshift==1.3.0 and dbt-core==1.3.4):

{% macro redshift__list_relations_without_caching(schema_relation) %}

  {% call statement('list_relations_without_caching', fetch_result=True) -%}
    select
        table_catalog as database,
        table_name as name,
        table_schema as schema,
        'table' as type
    from information_schema.tables
    where table_schema ilike '{{ schema_relation.schema }}'
    and table_type = 'BASE TABLE'
    union all
    select
      table_catalog as database,
      table_name as name,
      table_schema as schema,
      case
        when view_definition ilike '%create materialized view%'
          then 'materialized_view'
        else 'view'
      end as type
    from information_schema.views
    where table_schema ilike '{{ schema_relation.schema }}'
+    union all
+    select 
+        redshift_database_name as database,
+        tablename as name,
+        schemaname as schema,
+        'table' as type
+    from svv_external_tables
+    where schemaname ilike '{{ schema_relation.schema }}'
  {% endcall %}

  {{ return(load_result('list_relations_without_caching').table) }}

{% endmacro %}

trymzet avatar Nov 16 '23 20:11 trymzet