yamllint
Cannot parse UTF-8 strings
Traceback (most recent call last):
File "C:\msys64\mingw64\lib\python3.8\runpy.py", line 192, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\msys64\mingw64\lib\python3.8\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "c:\msys64\mingw64\bin\yamllint.exe\__main__.py", line 7, in <module>
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\cli.py", line 189, in run
prob_level = show_problems(problems, 'stdin', args_format=args.format)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\cli.py", line 91, in show_problems
for problem in problems:
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\linter.py", line 198, in _run
syntax_error = get_syntax_error(buffer)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\linter.py", line 179, in get_syntax_error
list(yaml.parse(buffer, Loader=yaml.BaseLoader))
File "C:\msys64\mingw64\lib\python3.8\site-packages\yaml\__init__.py", line 73, in parse
loader = Loader(stream)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yaml\loader.py", line 14, in __init__
Reader.__init__(self, stream)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yaml\reader.py", line 74, in __init__
self.check_printable(stream)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yaml\reader.py", line 143, in check_printable
raise ReaderError(self.name, position, ord(character),
yaml.reader.ReaderError: unacceptable character #xdc82: special characters are not allowed
in "<unicode string>", position 279
I am aware of https://github.com/adrienverge/yamllint/issues/20 and https://github.com/adrienverge/yamllint/issues/2, but those are about non-Windows platforms. On Windows, LANG and LC_CTYPE are generally not set. I think yamllint should provide a way to read UTF-8 strings even if LANG/LC_CTYPE is not set.
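For anyone puzzled by the `#xdc82` in the traceback: that value is consistent with Python's `surrogateescape` error handler, which maps an undecodable byte 0xNN to the lone surrogate U+DCNN. The sketch below only illustrates that mapping. It assumes stdin was decoded that way, uses made-up sample bytes, and is not a trace of what yamllint itself does.

```python
# Illustration only: how undecodable bytes become lone surrogates.
# Assumption: the input stream was decoded with errors="surrogateescape".

raw = "コ".encode("utf-8")  # the three bytes E3 82 B3

# ASCII cannot decode any of these bytes, so each one is escaped to
# U+DC00 + byte value instead of raising UnicodeDecodeError.
escaped = raw.decode("ascii", errors="surrogateescape")

code_points = [hex(ord(ch)) for ch in escaped]
print(code_points)

# The escaping is reversible, so the original bytes are not lost.
restored = escaped.encode("ascii", errors="surrogateescape")
print(restored == raw)
```

PyYAML rejects such surrogates as non-printable, which would explain the "special characters are not allowed" message.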
Can you provide a way to reproduce your problem, especially an input file that triggers the error + a yamllint version?
test.yaml
---
テスト: 'コード'
C:\temp>yamllint -v
yamllint 1.19.0
C:\temp>yamllint test.yaml
Traceback (most recent call last):
File "C:\msys64\mingw64\lib\python3.8\runpy.py", line 192, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\msys64\mingw64\lib\python3.8\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "c:\msys64\mingw64\bin\yamllint.exe\__main__.py", line 7, in <module>
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\cli.py", line 175, in run
problems = linter.run(f, conf, filepath)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\linter.py", line 237, in run
content = input.read()
UnicodeDecodeError: 'cp932' codec can't decode byte 0x86 in position 6: illegal multibyte sequence
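The failure above can probably be reproduced without yamllint or PyYAML. The sketch below assumes test.yaml is saved as UTF-8 with LF line endings; `try_decode` is a hypothetical helper written for this illustration.

```python
# Reproduce the decode failure with the standard library only.
# Assumption: test.yaml is UTF-8 encoded with LF line endings.

content = "---\nテスト: 'コード'\n"
raw = content.encode("utf-8")


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a short description of the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"{encoding} failed: {exc.reason}"


# Decoding with the encoding the file was written in works.
print(try_decode(raw, "utf-8"))

# Decoding the same bytes as cp932 (the default reported in the
# traceback) hits a byte sequence that cp932 does not define.
print(try_decode(raw, "cp932"))
```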
Setting PYTHONIOENCODING=UTF-8 fixes this for stdin:
C:\temp>yamllint - < test.yaml
Traceback (most recent call last):
File "C:\msys64\mingw64\lib\python3.8\runpy.py", line 192, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\msys64\mingw64\lib\python3.8\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "c:\msys64\mingw64\bin\yamllint.exe\__main__.py", line 7, in <module>
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\cli.py", line 189, in run
prob_level = show_problems(problems, 'stdin', args_format=args.format)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\cli.py", line 91, in show_problems
for problem in problems:
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\linter.py", line 198, in _run
syntax_error = get_syntax_error(buffer)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\linter.py", line 179, in get_syntax_error
list(yaml.parse(buffer, Loader=yaml.BaseLoader))
File "C:\msys64\mingw64\lib\python3.8\site-packages\yaml\__init__.py", line 73, in parse
loader = Loader(stream)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yaml\loader.py", line 14, in __init__
Reader.__init__(self, stream)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yaml\reader.py", line 74, in __init__
self.check_printable(stream)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yaml\reader.py", line 143, in check_printable
raise ReaderError(self.name, position, ord(character),
yaml.reader.ReaderError: unacceptable character #xdc86: special characters are not allowed
in "<unicode string>", position 5
C:\temp>set PYTHONIOENCODING=UTF-8
C:\temp>yamllint - < test.yaml
But file input is still broken:
C:\temp>yamllint test.yaml
Traceback (most recent call last):
File "C:\msys64\mingw64\lib\python3.8\runpy.py", line 192, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\msys64\mingw64\lib\python3.8\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "c:\msys64\mingw64\bin\yamllint.exe\__main__.py", line 7, in <module>
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\cli.py", line 175, in run
problems = linter.run(f, conf, filepath)
File "C:\msys64\mingw64\lib\python3.8\site-packages\yamllint\linter.py", line 237, in run
content = input.read()
UnicodeDecodeError: 'cp932' codec can't decode byte 0x86 in position 6: illegal multibyte sequence
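The difference between the two cases matches how Python documents these settings: PYTHONIOENCODING overrides the encoding of stdin, stdout and stderr only, while `open()` without an `encoding` argument falls back to `locale.getpreferredencoding(False)`. A rough way to check this on a given machine is sketched below; `report` is a hypothetical helper and the printed values depend on the platform.

```python
import os
import subprocess
import sys

# Child script: print the stdin encoding, then the default that open()
# would use when no encoding argument is passed.
CHILD = (
    "import locale, sys; "
    "print(sys.stdin.encoding.lower()); "
    "print(locale.getpreferredencoding(False).lower())"
)


def report(extra_env):
    """Run CHILD in a fresh interpreter and return [stdin_enc, open_enc]."""
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)  # start from a clean slate
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


baseline = report({})
overridden = report({"PYTHONIOENCODING": "UTF-8"})

print("without PYTHONIOENCODING:", baseline)
print("with PYTHONIOENCODING:   ", overridden)
```

If that is right, only the first value changes when the variable is set, which is why `yamllint - < test.yaml` starts working while `yamllint test.yaml` still fails.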
On Linux, your example file works perfectly. It looks like the default encoding on your Windows system is not UTF-8 (the traceback shows cp932).
yamllint uses PyYAML to parse YAML. Could you try the following command to see whether PyYAML is able to load the file?
python -c 'import yaml; yaml.safe_load(open("test.yaml").read());'
C:\temp>python -c "import yaml; yaml.safe_load(open('test.yaml').read());"
Traceback (most recent call last):
File "<string>", line 1, in <module>
UnicodeDecodeError: 'cp932' codec can't decode byte 0x86 in position 6: illegal multibyte sequence
This might be related to https://github.com/yaml/pyyaml/issues/123#issuecomment-395431735. The following will probably work:
python -c 'import yaml; yaml.safe_load(open("test.yaml", encoding="utf8").read());'
I confirmed that @rhysd's code works.
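For reference, the idea behind the working command is to pass the encoding explicitly instead of relying on the platform default. A minimal sketch with the standard library only; `read_text_utf8` is a hypothetical helper, not yamllint's actual code, and the temporary file merely stands in for test.yaml.

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a file as UTF-8 regardless of the platform default encoding."""
    with open(path, encoding="utf-8") as f:
        return f.read()


content = "---\nテスト: 'コード'\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.yaml")
    # Write the sample as UTF-8 with LF line endings.
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(content)
    text = read_text_utf8(path)

print(text == content)
```

As far as I know, PyYAML also accepts bytes or a binary file object and works out the encoding itself, so opening the file in `"rb"` mode may be another option. That variant is not tested here.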
I'm doing some issue gardening 🌱🌿 🌷 and came upon this issue. Since it's quite old I just wanted to ask if this is still relevant? If it isn't, maybe we can close this issue?
By closing some old issues we reduce the list of open issues to a more manageable set.
I think it's related to https://github.com/adrienverge/yamllint/pull/238, https://github.com/adrienverge/yamllint/pull/239 and https://github.com/adrienverge/yamllint/pull/240, and should not be closed (or closed as duplicate, if confirmed).