schemachange
Unicode issue when µ is included in a function
Creating the function below via schemachange fails with an error when run from an Azure DevOps pipeline:
CREATE FUNCTION PUBLIC.CONVERT_TO_UMOL_SCALE("ITEM" VARCHAR(16777216))
RETURNS FLOAT
LANGUAGE JAVASCRIPT
AS '
  if (ITEM) {
    var match = /([\\d\\.]+)\\s*(u|µ|n)mol/i.exec(ITEM);
    if (match) {
      console.log(match);
      if (["u", "U", "µ"].includes(match[2])) return parseFloat(match[1]);
      return parseFloat(match[1]) / 1000;
    }
  }
  return null;
';
Error details from the stack trace are below. Note that my script is part of a larger file, so I am not sure whether position 3411 refers to schemachange code or to the combination of my script and schemachange.
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.12/x64/bin/schemachange", line 8, in <module>
    sys.exit(main())
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/schemachange/cli.py", line 527, in main
    schemachange(args.config_folder, args.root_folder, args.snowflake_account, args.snowflake_user, args.snowflake_role, args.snowflake_warehouse, args.snowflake_database, args.change_history_table, args.vars, args.create_change_history_table, args.autocommit, args.verbose, args.dry_run)
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/schemachange/cli.py", line 154, in schemachange
    apply_change_script(script, config['vars'], config['snowflake-database'], change_history_table, snowflake_session_parameters, config['autocommit'], config['verbose'])
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/schemachange/cli.py", line 482, in apply_change_script
    content = get_script_contents_with_variable_replacement(script['script_full_path'], vars, verbose)
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/schemachange/cli.py", line 472, in get_script_contents_with_variable_replacement
    content = content_file.read().strip()
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 3411: invalid start byte
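One likely reading of this error, sketched below in Python: byte 0xb5 is how µ (U+00B5) is stored in a single-byte encoding such as Latin-1 or Windows-1252, while UTF-8 stores it as the two bytes 0xc2 0xb5. That the script file was saved in such a single-byte encoding is an assumption; the thread does not confirm it. The helper name `try_utf8` is made up for this illustration.

```python
# U+00B5 (micro sign), written as an escape so this snippet itself stays ASCII.
micro = "\u00b5"

# The same character encoded two ways.
latin1_bytes = micro.encode("latin-1")  # single byte 0xb5
utf8_bytes = micro.encode("utf-8")      # two bytes 0xc2 0xb5


def try_utf8(data: bytes) -> str:
    """Decode bytes as UTF-8, returning the failure reason instead of raising."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"error: {exc.reason}"


result_latin1 = try_utf8(latin1_bytes)  # fails the same way as the traceback
result_utf8 = try_utf8(utf8_bytes)      # decodes cleanly

print(latin1_bytes, "->", result_latin1)
print(utf8_bytes, "->", result_utf8)
```

If that reading is right, position 3411 would be a byte offset into the script file being read, since the failure happens inside `content_file.read()`.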
When I removed the µ symbol, it worked.
Hey there @lgc-jacovanwyk, thanks for the suggestion. Any ideas for how to fix?
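One possible fix, assuming the script file was saved in a single-byte encoding rather than UTF-8, is to re-save it as UTF-8 before the pipeline runs. The sketch below is a hypothetical helper, not part of schemachange; the function name `reencode_to_utf8` and the default source encoding `cp1252` are assumptions chosen for illustration.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite a file as UTF-8 if it is not already valid UTF-8.

    Returns True if the file was rewritten, False if it was left alone.
    The source encoding is a guess; cp1252 is a common editor default
    on Windows, but the real encoding of a given file may differ.
    """
    file_path = Path(path)
    raw = file_path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        text = raw.decode(source_encoding)
        # Write bytes directly so line endings are left untouched.
        file_path.write_bytes(text.encode("utf-8"))
        return True
```

Most editors can also do this directly through a "save with encoding" option, which avoids guessing the source encoding in code.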
Maybe use the hex character code instead of the literal Unicode character in your code.
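To illustrate the idea, here is a minimal Python analogue of the function's logic (not the Snowflake function itself) in which µ appears only as the escape `\u00b5`, so the source file contains nothing but ASCII and decodes the same way under any common encoding. The names `MICRO`, `PATTERN`, and `convert_to_umol` are made up for this sketch.

```python
import re

# U+00B5 written as an escape; no non-ASCII byte appears in this file.
MICRO = "\u00b5"

# In a raw string the escape is passed through to the re module,
# which interprets \u00b5 as the micro sign itself.
PATTERN = re.compile(r"([\d.]+)\s*(u|\u00b5|n)mol", re.IGNORECASE)


def convert_to_umol(item):
    """Return the quantity in umol, or None if no unit is found."""
    if not item:
        return None
    match = PATTERN.search(item)
    if not match:
        return None
    value = float(match.group(1))
    # Micromole spellings are returned as-is; nanomoles are scaled down.
    if match.group(2) in ("u", "U", MICRO):
        return value
    return value / 1000
```

In the JavaScript body of the function, the equivalent change would presumably be to write `\\u00B5` in place of µ, following the doubled-backslash convention the script already uses for `\\d`. That is an untested assumption about how the escape passes through the single-quoted Snowflake string.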