dbt-project-evaluator icon indicating copy to clipboard operation
dbt-project-evaluator copied to clipboard

raw table reference fact model

Open dave-connors-3 opened this issue 1 year ago • 2 comments

Describe the feature

Create a model that parses SQL using regex to identify raw table references in model code

Who will this benefit?

All y'all

dave-connors-3 avatar Aug 12 '22 16:08 dave-connors-3

➕ To this one! I would love to see a table like the following:

model explicit_reference
lost_revenue raw_tpch.tpch_sf1.orders
lost_revenue raw_tpch.tpch_sf1.customer
lost_revenue raw_tpch.tpch_sf1.lineitem
lost_revenue raw_tpch.tpch_sf1.nation
UK_Lowcost_Brass_Suppliers raw_tpch.tpch_sf1.partsupp

Throwing some example code here which uses the tpch_sf1 data sets, although the second references some other models I have in my project. Feel free to ping me when this is being worked on if you need access to the code for testing! I should have a complete version then, but right now these models are a WIP for a Coalesce workshop so there's not a good version I can link here.

I have two models which use explicit references: lost_revenue.sql:

WITH

-- Only orders which tie back to a real customer record
ro as (
    select
        o.o_orderkey,
        c.c_custkey
    from raw_tpch.tpch_sf1.orders o
    inner join raw_tpch.tpch_sf1.customer c
        on o.o_custkey = c.c_custkey
),

rl as (
    SELECT c.c_custkey
    ,c.c_name
    ,sum(revenue) AS revenue_lost
    ,c_acctbal
    ,n.n_name 
    ,c_address 
    ,c_phone 
    ,c_comment
    FROM ro left join (SELECT l.l_orderkey AS ORDER_ID, o.o_custkey AS CUSTOMER_ID, SUM(l.l_extendedprice * (1 - l.l_discount)) AS REVENUE
        FROM raw_tpch.tpch_sf1.lineitem l LEFT JOIN raw_tpch.tpch_sf1.orders o ON l.l_orderkey = o.o_orderkey
        WHERE o.o_orderdate >= '1994-01-01' AND o.o_orderdate < '1994-04-01' AND l.l_returnflag = 'R' GROUP BY o.o_custkey, l.l_orderkey
    ) lo on lo.ORDER_ID = ro.o_orderkey and lo.CUSTOMER_ID = ro.c_custkey
    LEFT JOIN raw_tpch.tpch_sf1.customer c ON c.c_custkey = lo.CUSTOMER_ID LEFT JOIN raw_tpch.tpch_sf1.nation n ON c.c_nationkey = n.n_nationkey
    WHERE lo.CUSTOMER_ID is not null GROUP BY c.c_custkey,c.c_name,c.c_acctbal,c.c_phone,n.n_name,c.c_address,c.c_comment ORDER BY revenue_lost DESC
)

select * from rl

uk_lowcost_brass_suppliers.sql:

select 
    e.p_name            AS Part_Name,
    e.p_retailprice     AS RetailPrice,
    e.s_name            AS Supplier_Name,
    e.p_mfgr            AS Part_Manufacturer,
    e.s_address         AS SuppAddr,
    e.s_phone           AS Supp_Phone,
    ps.PS_AVAILQTY      AS Num_Available 

from {{ ref('EUR_LOWCOST_BRASS_SUPPLIERS') }} e
LEFT JOIN {{ source('TPCH_SF1', 'SUPPLIER') }} s on e.S_NAME = s.S_NAME
LEFT JOIN
    raw_tpch.tpch_sf1.partsupp ps
on e.P_PARTKEY = ps.PS_PARTKEY
    and s.S_SUPPKEY = ps.PS_SUPPKEY

where n_name = 'UNITED KINGDOM'

christineberger avatar Aug 12 '22 19:08 christineberger

I will give this one a go in the next days/weeks.

b-per avatar Sep 13 '22 07:09 b-per