
Support automatic table properties for Redshift/Snowflake and Incremental by Time models

Open eakmanrq opened this issue 1 year ago • 2 comments

Snowflake should define the time column as part of the cluster_by key. Redshift should define it as part of its sort_key. If a user also defines these keys in the table properties, we need to make sure the user's values are merged properly with what the model defines automatically.
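To make the merge requirement concrete, here is a minimal sketch of one possible merge rule. The function name `merge_key_columns` and the choice to put the time column first are assumptions for illustration only; they are not existing sqlmesh behavior, and the right ordering is something this issue still has to decide.

```python
from typing import List, Optional


def merge_key_columns(
    time_column: str, user_columns: Optional[List[str]] = None
) -> List[str]:
    """Hypothetical helper: combine the automatic time column with user keys.

    Assumption: the time column goes first, and a user-supplied duplicate of
    it (compared case-insensitively) is dropped rather than repeated.
    """
    merged = [time_column]
    for column in user_columns or []:
        if column.lower() != time_column.lower():
            merged.append(column)
    return merged


# The user already listed the time column, so it is not duplicated.
print(merge_key_columns("ds", ["customer_id", "ds"]))
# No user-defined keys: only the automatic time column is used.
print(merge_key_columns("ds"))
```

The same rule would apply to both cluster_by on Snowflake and sort_key on Redshift, since both are ordered lists of columns.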

Also note that we need to establish canonical terms for these properties so that they are consistent across engines. Proposal:

Partition By:

  • Spark: Partition By
  • Redshift: Not Implemented
  • Snowflake: Not Implemented
  • BigQuery: Partition By

Sort By:

  • Spark (Delta): Z-order
  • Redshift: Sort Key
  • Snowflake: Cluster By
  • BigQuery: Cluster By

Dist By:

  • Spark: Not Implemented
  • Redshift: Dist Key
  • Snowflake: Not Implemented
  • BigQuery: Not Implemented

So, for example, when handling the time column of an Incremental by Time model, the engine adapter's reasoning would be: "If I support Partition By, use that for the time column. If not, use Sort By. If I support neither, do nothing."
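The fallback rule can be sketched as a lookup over the proposed canonical terms. This is only an illustration: the `ENGINE_TERMS` table and the `time_column_property` function are hypothetical names, not part of sqlmesh, and the table simply transcribes the proposal in this issue.

```python
from typing import Dict, Optional

# Canonical term -> engine-specific term, transcribed from the proposal.
# None means the engine does not implement that property.
ENGINE_TERMS: Dict[str, Dict[str, Optional[str]]] = {
    # Z-order applies to Spark with Delta tables only.
    "spark": {"partition_by": "Partition By", "sort_by": "Z-order", "dist_by": None},
    "redshift": {"partition_by": None, "sort_by": "Sort Key", "dist_by": "Dist Key"},
    "snowflake": {"partition_by": None, "sort_by": "Cluster By", "dist_by": None},
    "bigquery": {"partition_by": "Partition By", "sort_by": "Cluster By", "dist_by": None},
}


def time_column_property(engine: str) -> Optional[str]:
    """Pick the canonical property that should carry the time column.

    Prefers partition_by, falls back to sort_by, and returns None when the
    engine supports neither (or is unknown), meaning "do nothing".
    """
    terms = ENGINE_TERMS.get(engine, {})
    for canonical in ("partition_by", "sort_by"):
        if terms.get(canonical) is not None:
            return canonical
    return None


for name in ("spark", "redshift", "snowflake", "bigquery"):
    print(name, "->", time_column_property(name))
```

Under this table, Spark and BigQuery would partition by the time column, while Redshift and Snowflake would fall back to their sort-style key, which matches the behavior requested at the top of the issue.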

eakmanrq, Sep 07 '23 17:09