Unicode issue when µ is included in a function

Open lgc-jacovanwyk opened this issue 3 years ago • 2 comments

Trying to create the function below via schemachange gives me an error when running in an Azure DevOps pipeline:

CREATE FUNCTION PUBLIC.CONVERT_TO_UMOL_SCALE("ITEM" VARCHAR(16777216))
RETURNS FLOAT
LANGUAGE JAVASCRIPT
AS '
    if (ITEM) {
        var match = /([\\d\\.]+)\\s*(u|µ|n)mol/i.exec(ITEM);
        if (match) {
            console.log(match);
            if (["u", "U", "µ"].includes(match[2])) return parseFloat(match[1]);
            return parseFloat(match[1]) / 1000;
        }
    }
    return null;
    ';

Error details from the stack trace are below. Note that my script is part of a larger file, so I'm not sure whether position 3411 refers to schemachange code or to the combination of my script and schemachange.

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.12/x64/bin/schemachange", line 8, in <module>
    sys.exit(main())
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/schemachange/cli.py", line 527, in main
    schemachange(args.config_folder, args.root_folder, args.snowflake_account, args.snowflake_user, args.snowflake_role, args.snowflake_warehouse, args.snowflake_database, args.change_history_table, args.vars, args.create_change_history_table, args.autocommit, args.verbose, args.dry_run)
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/schemachange/cli.py", line 154, in schemachange
    apply_change_script(script, config['vars'], config['snowflake-database'], change_history_table, snowflake_session_parameters, config['autocommit'], config['verbose'])
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/schemachange/cli.py", line 482, in apply_change_script
    content = get_script_contents_with_variable_replacement(script['script_full_path'], vars, verbose)
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/schemachange/cli.py", line 472, in get_script_contents_with_variable_replacement
    content = content_file.read().strip()
  File "/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 3411: invalid start byte

When I removed the µ symbol, it worked.
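
For what it's worth, byte 0xb5 is how µ is stored in single-byte encodings such as Latin-1 or Windows-1252, while UTF-8 stores it as the two bytes 0xc2 0xb5. The traceback fails inside `content_file.read()`, so position 3411 is most likely a byte offset in the script file itself, which suggests the file was saved in a non-UTF-8 encoding. Below is a minimal Python sketch of the mismatch and a possible re-save. It assumes the file is Windows-1252, which is a guess; the helper names `to_utf8` and `resave_as_utf8` are made up for illustration and are not part of schemachange.

```python
from pathlib import Path

MICRO = "\u00b5"  # the µ character, written as an escape so this file stays ASCII

# In Windows-1252 (and Latin-1) µ is the single byte 0xb5.
legacy_bytes = MICRO.encode("cp1252")

# Decoding that byte as UTF-8 fails the same way the traceback does.
try:
    legacy_bytes.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = str(exc)


def to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode bytes from an assumed source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def resave_as_utf8(path: str, source_encoding: str = "cp1252") -> None:
    """Rewrite a script file in place as UTF-8 (hypothetical helper)."""
    file_path = Path(path)
    file_path.write_bytes(to_utf8(file_path.read_bytes(), source_encoding))
```

Re-saving the script as UTF-8 from the editor should have the same effect as `resave_as_utf8`, and keeps the µ readable in the source.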

lgc-jacovanwyk avatar Oct 21 '21 11:10 lgc-jacovanwyk

Hey there @lgc-jacovanwyk, thanks for the suggestion. Any ideas for how to fix?

sfc-gh-jhansen avatar Nov 07 '21 05:11 sfc-gh-jhansen

Maybe use the hex character code (U+00B5 for µ) instead of the actual Unicode letter in your code.
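
A rough Python sketch of that idea, in case it helps: it rewrites every non-ASCII character as a JavaScript-style `\uXXXX` escape so the script file contains only ASCII bytes. The function name `escape_non_ascii` is hypothetical. The function body in the original post doubles its backslashes (`\\d`), so the escape would presumably need to be written as `\\u00B5` there; that part is an assumption and has not been checked against Snowflake.

```python
def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with \\uXXXX escape(s)."""
    parts = []
    for ch in text:
        if ord(ch) < 128:
            parts.append(ch)
            continue
        # Go through UTF-16 code units so characters outside the Basic
        # Multilingual Plane come out as surrogate pairs, as JavaScript expects.
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            code_unit = int.from_bytes(units[i:i + 2], "big")
            parts.append("\\u{:04X}".format(code_unit))
    return "".join(parts)


# The character class from the regex in the original post, with µ escaped.
escaped_pattern = escape_non_ascii("(u|\u00b5|n)mol")
```

The result is pure ASCII, so it reads identically whether the file is decoded as UTF-8 or as a legacy single-byte encoding.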

kmichas79 avatar Feb 15 '22 09:02 kmichas79