snowplow-rdb-loader icon indicating copy to clipboard operation
snowplow-rdb-loader copied to clipboard

Change `domain_sessionid` to be a varchar instead of a `char`

Open rlh1994 opened this issue 7 months ago • 0 comments

Currently for redshift and snowflake the domain_sessionid column is loaded as a char(128) when most other columns are a varchar(128). In snowflake this doesn't actually matter as snowflake does not pad the string (https://docs.snowflake.com/en/sql-reference/data-types-text#char-character-nchar) but in redshift this padding does occur, which can lead to confusion from users and edge case issues with data models, it also has a slightly higher storage cost when compared to a varchar.

Given no other columns are a char I don't foresee a reason that this needs to remain as one.

This is everywhere I can see the definition made: https://github.com/snowplow/snowplow-rdb-loader/blob/05dd5ab11360d52196605d83d724af16b57a95f3/modules/redshift-loader/src/main/scala/com/snowplowanalytics/snowplow/loader/redshift/RedshiftEventsTable.scala#L134

https://github.com/snowplow/snowplow-rdb-loader/blob/05dd5ab11360d52196605d83d724af16b57a95f3/modules/snowflake-loader/src/main/scala/com/snowplowanalytics/snowplow/loader/snowflake/SnowflakeEventsTable.scala#L134

https://github.com/snowplow/snowplow-rdb-loader/blob/05dd5ab11360d52196605d83d724af16b57a95f3/modules/snowflake-loader/src/main/resources/atomic-def.sql#L159

https://github.com/snowplow/snowplow-rdb-loader/blob/05dd5ab11360d52196605d83d724af16b57a95f3/modules/snowflake-loader/src/main/scala/com/snowplowanalytics/snowplow/loader/snowflake/ast/AtomicDef.scala#L193

rlh1994 avatar Dec 01 '23 08:12 rlh1994