dbt-sqlserver icon indicating copy to clipboard operation
dbt-sqlserver copied to clipboard

Slow performance when materialization as table (`SELECT * INTO Model__dbt_tmp FROM Model__dbt_tmp_temp_view`)

Open romiof opened this issue 1 year ago • 5 comments

Hello!

This is a bug and a suggestion for improvement... I'm suffering with slow performance when materialization as table.

My environment

I started this month to use dbt for MS SQL. My database is a SQL Server 2014. Using the last version of dbt 1.4.3

My issue

I'm having problems whit materialization macro which make the tables at our SQL Server. Its performance is very, very slow.

More specific, my project consist in:

  • Using source yml files, to map my tables to dbt objects
  • Created views for stage layer, with this views querying my source tables. as suggested for dbt team here
  • Created my Dimensions and Fact entities as tables
  • Here, when I simply make a SELECT * on one model SQL, in the worst case, it takes less than 1 minute to return all lines.
  • But when using dbt run / dbt build it performs very poor.

What I investigated

  • In my case, the _Model__dbt_tmp_temp_view_ is a nested view, and this is the origin of my problem.
  • With a simple SELECT * FROM _Model__dbt_tmp_temp_view_ , everything works fine, the performance is equal when I select without the temp view.
  • But when using the INTO Model__dbt_tmp (https://github.com/dbt-msft/dbt-sqlserver/blob/master/dbt/include/sqlserver/macros/materializations/models/table/create_table_as.sql#LL23-L25) my SQL change its plan and it takes about 10 minutes to execute this materialization.
  • In last two days, I make a lot of tests, changed my JOINs, refactorated my SELECT but when I use nested views (because I'm following dbt best pratice) I got this slow performance. I'm not a expert to debug my execution plan, just I realized that these two queries make different plans ( SELECT * FROM _Model__dbt_tmp_temp_view_ vs SELECT * INTO Model__dbt_tmp FROM _Model__dbt_tmp_temp_view_)
  • Maybe it is a glitch in our old SQL Server 2014

A possible workaround

I also discovered, that there is a hint which make SELECT * INTO Model__dbt_tmp FROM _Model__dbt_tmp_temp_view_ runs with its original perfomance: OPTION (FORCE ORDER)

I founded it from this SO post.

So, if I run this query SELECT * INTO Model__dbt_tmp FROM _Model__dbt_tmp_temp_view_ OPTION (FORCE ORDER) from my MS SQL Studio, it create my Model table in less than 1 minute.

Setting this option at dbt-sqlserver

I'd like to know, how can we change this macro to have a possibility to include this hint ? Maybe a configuration, like as_columnstore ?

romiof avatar May 29 '23 12:05 romiof