snowplow-rdb-loader
snowplow-rdb-loader copied to clipboard
Change `domain_sessionid` to be a varchar instead of a `char`
Currently for redshift and snowflake the domain_sessionid
column is loaded as a char(128)
when most other columns are a varchar(128)
. In snowflake this doesn't actually matter as snowflake does not pad the string (https://docs.snowflake.com/en/sql-reference/data-types-text#char-character-nchar) but in redshift this padding does occur, which can lead to confusion from users and edge case issues with data models, it also has a slightly higher storage cost when compared to a varchar.
Given no other columns are a char
I don't foresee a reason that this needs to remain as one.
This is everywhere I can see the definition made: https://github.com/snowplow/snowplow-rdb-loader/blob/05dd5ab11360d52196605d83d724af16b57a95f3/modules/redshift-loader/src/main/scala/com/snowplowanalytics/snowplow/loader/redshift/RedshiftEventsTable.scala#L134
https://github.com/snowplow/snowplow-rdb-loader/blob/05dd5ab11360d52196605d83d724af16b57a95f3/modules/snowflake-loader/src/main/scala/com/snowplowanalytics/snowplow/loader/snowflake/SnowflakeEventsTable.scala#L134
https://github.com/snowplow/snowplow-rdb-loader/blob/05dd5ab11360d52196605d83d724af16b57a95f3/modules/snowflake-loader/src/main/resources/atomic-def.sql#L159
https://github.com/snowplow/snowplow-rdb-loader/blob/05dd5ab11360d52196605d83d724af16b57a95f3/modules/snowflake-loader/src/main/scala/com/snowplowanalytics/snowplow/loader/snowflake/ast/AtomicDef.scala#L193