soda-sql
soda-sql copied to clipboard
REGEX use on a collated VARCHAR column in Snowflake causes error
Describe the bug If a Snowflake VARCHAR column is defined with collation, REGEX functions cause an error.
To Reproduce Steps to reproduce the behavior:
- Create a VARCHAR column in a Snowflake table
- Run
soda analyze ... - Or write scan file with valid_regex entry for column
- Run
soda scan ...using that scan file
Context
Snowflake does not support REGEX on collated columns.
Collation can be removed from a column by wrapping the expression in,
COLLATE({expr}, '')
OS: Mac OS Big Sur version 11.6 Python Version: Python 3.9.10 Soda SQL Version: 2.1.2 Warehouse Type: Snowflake
Fix is in the branch 665-snowflake-collation. Though this removes collation from all VARCHAR columns, collated or not. While this is benign, it is unnecessary. Would it be more appropriate to have a conditional or some other way to select this feature?