omicsplayground
character encoding warning
When running OPG on DNAnexus, we get a lot of character encoding warnings for special symbols such as beta, gamma, and delta. How should we tackle this? These symbols are commonplace in biology.
input string '[GSE30589] induce ifnβ:YES_vs_NO' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE60377] genotype:Setdb1Δ-KO_vs_other' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE60377] genotype:Setdb1Δ+HET1_vs_other' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE60377] genotype:Setdb1Δ+HET1_vs_other' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE30589] induce ifnβ:YES_vs_NO' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE60377] genotype:Setdb1Δ-KO_vs_other' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE30589] induce ifnβ:YES_vs_NO' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE60377] genotype:Setdb1Δ+HET1_vs_other' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE60377] genotype:Setdb1Δ-KO_vs_other' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE60377] genotype:Setdb1Δ-KO_vs_other' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE60377] genotype:Setdb1Δ+HET1_vs_other' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?
Warning in load(pgxfile1, verbose = 0) : input string '[GSE30589] induce ifnβ:YES_vs_NO' cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?

Traceback (most recent call last):
  File "/usr/local/bin/dx-log-stream", line 70, in
    log_function(line.rstrip("\n"))
  File "/usr/lib/python3.8/logging/__init__.py", line 2082, in info
    root.info(msg, *args, **kwargs)
  File "/usr/lib/python3.8/logging/__init__.py", line 1446, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/lib/python3.8/logging/__init__.py", line 1589, in _log
    self.handle(record)
  File "/usr/lib/python3.8/logging/__init__.py", line 1599, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.8/logging/__init__.py", line 1661, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.8/logging/__init__.py", line 954, in handle
    self.emit(record)
  File "/usr/local/lib/python3.8/dist-packages/dxpy/dxlog.py", line 101, in emit
    message = self.truncate_message(message)
  File "/usr/local/lib/python3.8/dist-packages/dxpy/dxlog.py", line 73, in truncate_message
    msg_bytes = message if USING_PYTHON2 else message.encode('utf-8')
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcd0' in position 123: surrogates not allowed
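For anyone trying to reproduce the final UnicodeEncodeError outside DNAnexus, here is a minimal Python sketch of what I suspect is happening. It assumes (not confirmed from the dx-log-stream source) that the log line is decoded under an ASCII locale with the `surrogateescape` error handler, so each non-ASCII byte becomes a lone surrogate such as `'\udcd0'`, which a strict UTF-8 encode then rejects. The sample string is taken from the warnings above; the variable names are only illustrative.

```python
# Hypothetical reproduction; the decoding step is an assumption about
# how the log stream is read, not taken from the dxpy source.

# UTF-8 bytes of a label containing a Greek beta, as R would write it.
raw = "ifnβ".encode("utf-8")  # b'ifn\xce\xb2'

# Decoding as ASCII with surrogateescape maps each byte >= 0x80 to a
# lone surrogate in the range U+DC80..U+DCFF instead of raising.
text = raw.decode("ascii", errors="surrogateescape")

# A strict UTF-8 encode, like message.encode('utf-8') in the traceback,
# refuses lone surrogates.
try:
    text.encode("utf-8")
    strict_error = None
except UnicodeEncodeError as exc:
    strict_error = exc

# Encoding with the same error handler restores the original bytes.
restored = text.encode("utf-8", errors="surrogateescape")

print(repr(text))
print(strict_error)
print(restored.decode("utf-8"))
```

If this is indeed the mechanism, running the job under a UTF-8 locale rather than ANSI_X3.4-1968 (plain ASCII) would likely avoid both the R warnings and the Python error, though I have not verified that on DNAnexus.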