
Lacking robustness when opening non-YAML files

Open tamere-allo-peter opened this issue 2 years ago • 3 comments

As shown below, yamllint should ignore files that are not YAML (to the best of its knowledge), or at least fail gracefully, in all cases. Currently this only works when you pass a directory on the command line.

In the example below, a directory contains some YAML files as well as a JPEG image file. The yamllint * command fails, even if I specify a config file that explicitly lists the allowed filename patterns.

The command only works correctly if I specify the name of the directory (e.g. ~/work) instead of * on the command line.

jerome@OPT17844:~/work$ yamllint -v
yamllint 1.26.3

jerome@OPT17844:~/work$ ls -la
total 96
drwxrwxr-x  2 jerome jerome   147 mai   13 11:23 .
drwxrwx--- 91 jerome jerome 12288 mai   13 11:23 ..
-rw-rw-r--  1 jerome jerome 60229 mai   13 11:23 1.jpeg
-rw-rw-r--  1 jerome jerome   382 mai   13 11:23 bad.yml
-rw-rw-r--  1 jerome jerome     0 mai   13 11:23 empty.yml
-rw-rw-r--  1 jerome jerome   132 mai   13 11:23 fakeansibleplaybook.yml
-rw-rw-r--  1 jerome jerome    16 mai   13 11:23 fakeansiblevault.yml
-rw-rw-r--  1 jerome jerome   381 mai   13 11:23 good.yml
-rw-rw-r--  1 jerome jerome    27 mai   13 11:23 withtabs.yml

jerome@OPT17844:~/work$ yamllint *
Traceback (most recent call last):
  File "/home/jerome/.local/bin/yamllint", line 8, in <module>
    sys.exit(run())
  File "/home/jerome/.local/lib/python3.8/site-packages/yamllint/cli.py", line 206, in run
    problems = linter.run(f, conf, filepath)
  File "/home/jerome/.local/lib/python3.8/site-packages/yamllint/linter.py", line 240, in run
    content = input.read()
  File "/usr/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

jerome@OPT17844:~/work$ cat ~/.yamllint 
---

extends: default

yaml-files:
  - '*.yaml'
  - '*.yml'
  - '.yamllint'

rules:
  line-length:
    max: 160

jerome@OPT17844:~/work$ yamllint --config-file ~/.yamllint *
Traceback (most recent call last):
  File "/home/jerome/.local/bin/yamllint", line 8, in <module>
    sys.exit(run())
  File "/home/jerome/.local/lib/python3.8/site-packages/yamllint/cli.py", line 206, in run
    problems = linter.run(f, conf, filepath)
  File "/home/jerome/.local/lib/python3.8/site-packages/yamllint/linter.py", line 240, in run
    content = input.read()
  File "/usr/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
jerome@OPT17844:~/work$

I can see several approaches, but maybe the best would be to combine them all:

  1. Either wrap this line https://github.com/adrienverge/yamllint/blob/c268a82c5a72921ea3598e745341b23fa7d13e58/yamllint/linter.py#L240 so that it does a return () whenever there's a UnicodeDecodeError, in order to simply ignore files which can't be processed. This would not ignore all non-YAML files, though, and it could cause other problems depending on the encoding used.
  2. Or, probably better, do the yield only inside an if conf.is_yaml_file(item) test around this line https://github.com/adrienverge/yamllint/blob/c268a82c5a72921ea3598e745341b23fa7d13e58/yamllint/cli.py#L41

A combined approach may be better, because with approach 2 alone the traceback would probably still appear if no configuration file is available.

In any case, yamllint should ignore files that can't be YAML, silently or not, but it shouldn't fail with a traceback.
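To make the combined idea concrete, here is a minimal standalone sketch. It does not use yamllint's own code: the names `YAML_PATTERNS`, `looks_like_yaml` and `read_candidates` are hypothetical, and `fnmatch` only stands in for whatever pattern matching yamllint applies to `yaml-files`. It filters explicit file arguments by pattern first (approach 2), then skips anything that still can't be decoded (approach 1).

```python
import fnmatch
import os

# Hypothetical pattern list mirroring the yaml-files section of the config above.
YAML_PATTERNS = ['*.yaml', '*.yml', '.yamllint']


def looks_like_yaml(path, patterns=YAML_PATTERNS):
    """Return True if the file's basename matches one of the patterns."""
    name = os.path.basename(path)
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def read_candidates(paths, patterns=YAML_PATTERNS):
    """Yield (path, content) for files that match a pattern and decode as UTF-8."""
    for path in paths:
        if not looks_like_yaml(path, patterns):
            continue  # approach 2: ignore names that don't match
        try:
            with open(path, encoding='utf-8') as f:
                content = f.read()
        except UnicodeDecodeError:
            continue  # approach 1: ignore files that can't be decoded
        yield path, content
```

With the directory listing above, a call such as `read_candidates(sorted(os.listdir('.')))` would never open 1.jpeg at all, and a binary file that happens to be named like a YAML file would be skipped instead of raising.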

Hoping this helps...

tamere-allo-peter avatar May 13 '22 00:05 tamere-allo-peter

Seeing a similar issue on Windows:

> yamllint -f colored . 
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "~\scoop\apps\python\current\Scripts\yamllint.exe\__main__.py", line 7, in <module>
  File "~\scoop\apps\python\current\Lib\site-packages\yamllint\cli.py", line 217, in run
    problems = linter.run(f, conf, filepath)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~\scoop\apps\python\current\Lib\site-packages\yamllint\linter.py", line 232, in run
    content = input.read()
              ^^^^^^^^^^^^
  File "~\scoop\apps\python\current\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 2116: character maps to <undefined>

viceice avatar Apr 17 '23 10:04 viceice

It would be convenient if yamllint at least printed the pathname of the file that resulted in the error.
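As a rough illustration of what that could look like, here is a small standalone sketch. The helper name `read_text_or_report` is hypothetical and is not part of yamllint; it only shows the general pattern of catching the decode error and naming the offending file.

```python
import sys


def read_text_or_report(path, encoding='utf-8', stream=sys.stderr):
    """Return the file's text, or None after printing which file failed."""
    try:
        with open(path, encoding=encoding) as f:
            return f.read()
    except UnicodeDecodeError as exc:
        # exc.reason is the short message, e.g. "invalid start byte"
        print(f'{path}: cannot decode as {encoding}: {exc.reason}', file=stream)
        return None
```

The caller can then decide whether a `None` result means "skip this file" or "exit with an error", but either way the message identifies the file.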

nathanweeks avatar Jun 28 '23 11:06 nathanweeks

Can confirm this is still an issue. If I pass all my root folders one by one there is no issue, but when I run pre-commit run yamllint -a I get the error:

Traceback (most recent call last):
  File "/Users/c.vd.kerk/.cache/pre-commit/repoxkcn1coq/py_env-python3.11/bin/yamllint", line 8, in <module>
    sys.exit(run())
             ^^^^^
  File "/Users/c.vd.kerk/.cache/pre-commit/repoxkcn1coq/py_env-python3.11/lib/python3.11/site-packages/yamllint/cli.py", line 234, in run
    problems = linter.run(f, conf, filepath)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/c.vd.kerk/.cache/pre-commit/repoxkcn1coq/py_env-python3.11/lib/python3.11/site-packages/yamllint/linter.py", line 233, in run
    content = input.read()
              ^^^^^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

CaspervdKerk avatar Sep 14 '23 11:09 CaspervdKerk