Reports that contain Unicode will cause Bandit pre-commit hook to crash

Open Dantos7 opened this issue 1 year ago • 3 comments

Describe the bug

When the pre-commit hook runs on a file that contains Unicode characters, Bandit crashes.

The bug was fixed for CLI in #362 but the issue is still present when using the pre-commit hook.

Reproduction steps

  1. Initialize a Git repository:
git init
  2. Create the Python file test.py:
secret = u'Don\'t👏hard👏code👏secrets'
  3. Create the file .pre-commit-config.yaml:
repos:
  - repo: https://github.com/PyCQA/bandit
    rev: '1.7.5'
    hooks:
    - id: bandit
  4. Install the pre-commit Python package (e.g., with pipx)
  5. Install the pre-commit hooks:
pre-commit install
  6. Stage both files:
git add .
  7. Run the pre-commit hook:
pre-commit run --all-files
  8. Observe the resulting crash and traceback:
bandit...................................................................Failed
- hook id: bandit
- exit code: 1

[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.11.3
[node_visitor]  WARNING Unable to find qualified name for module: test.py
Traceback (most recent call last):
  File "C:\Users\<username>\.cache\pre-commit\repowg97az4b\py_env-python3\Lib\site-packages\bandit\core\manager.py", line 188, in output_results
    report_func(
  File "C:\Users\<username>\.cache\pre-commit\repowg97az4b\py_env-python3\Lib\site-packages\bandit\formatters\text.py", line 196, in report     
    wrapped_file.write(result)
  File "C:\Users\<username>\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f44f' in position 135: character maps to <undefined>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\<username>\.cache\pre-commit\repowg97az4b\py_env-python3\Scripts\bandit.EXE\__main__.py", line 7, in <module>
  File "C:\Users\<username>\.cache\pre-commit\repowg97az4b\py_env-python3\Lib\site-packages\bandit\cli\main.py", line 672, in main
    b_mgr.output_results(
  File "C:\Users\<username>\.cache\pre-commit\repowg97az4b\py_env-python3\Lib\site-packages\bandit\core\manager.py", line 197, in output_results
    raise RuntimeError(
RuntimeError: Unable to output report using 'txt' formatter: 'charmap' codec can't encode character '\U0001f44f' in position 135: character maps to <undefined>

Expected behavior

I expect Bandit to create the report successfully, either by handling Unicode encodings or by removing the problematic Unicode characters.
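To make the two options concrete, here is a minimal sketch that does not use Bandit at all. It encodes a report-like line with the cp1252 codec named in the traceback, first strictly (which fails the same way), then with the standard `errors` handlers and with UTF-8. The `encode_report` helper and `REPORT_LINE` constant are hypothetical names for illustration only; this is not how Bandit writes its report.

```python
# Illustration only: encode_report and REPORT_LINE are hypothetical names,
# not part of Bandit. The emoji is written as an escape so this file stays ASCII.
REPORT_LINE = (
    "Possible hardcoded password: "
    "'Don't\U0001f44fhard\U0001f44fcode\U0001f44fsecrets'"
)


def encode_report(text, encoding="cp1252", errors="strict"):
    """Encode report text the way a text stream with this encoding would."""
    return text.encode(encoding, errors=errors)


# Strict encoding with cp1252 raises UnicodeEncodeError, as in the traceback.
try:
    encode_report(REPORT_LINE)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Option 1: keep the information by escaping characters cp1252 cannot represent.
escaped = encode_report(REPORT_LINE, errors="backslashreplace")

# Option 2: drop the information by substituting a placeholder character.
replaced = encode_report(REPORT_LINE, errors="replace")

# Option 3: use an encoding that can represent every character.
utf8 = encode_report(REPORT_LINE, encoding="utf-8")

print(strict_failed)
print(escaped.decode("cp1252"))
print(replaced.decode("cp1252"))
```

Any of the three non-strict variants would let the report be written; they differ only in whether the original characters survive in the output.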

Bandit version

1.7.5 (Default)

Python version

3.11 (Default)

Dantos7 avatar Aug 02 '23 15:08 Dantos7

Just had the same issue.

ethantenison avatar Mar 09 '24 18:03 ethantenison

I'm not able to reproduce on macOS. What operating system are you using? What is the value of your TERM environment variable?

Erics-MacBook-Pro-2:examples ericwb$ pre-commit run --all-files
bandit...................................................................Failed
- hook id: bandit
- exit code: 1

[main]	INFO	profile include tests: None
[main]	INFO	profile exclude tests: None
[main]	INFO	cli include tests: None
[main]	INFO	cli exclude tests: None
[main]	INFO	running on Python 3.12.2
Run started:2024-03-09 20:40:09.524162

Test results:
>> Issue: [B324:hashlib] Use of weak MD5 hash for security. Consider usedforsecurity=False
   Severity: High   Confidence: High
   CWE: CWE-327 (https://cwe.mitre.org/data/definitions/327.html)
   More Info: https://bandit.readthedocs.io/en/0.0.0/plugins/b324_hashlib.html
   Location: ./python/stdlib/hashlib_md5.py:4:0
3	
4	hashlib.md5()

--------------------------------------------------
>> Issue: [B105:hardcoded_password_string] Possible hardcoded password: 'Don't👏hard👏code👏secrets'
   Severity: Low   Confidence: Medium
   CWE: CWE-259 (https://cwe.mitre.org/data/definitions/259.html)
   More Info: https://bandit.readthedocs.io/en/0.0.0/plugins/b105_hardcoded_password_string.html
   Location: ./test.py:1:9
1	secret = u'Don\'t👏hard👏code👏secrets'

--------------------------------------------------

Code scanned:
	Total lines of code: 12
	Total lines skipped (#nosec): 0

Run metrics:
	Total issues (by severity):
		Undefined: 0
		Low: 1
		Medium: 0
		High: 1
	Total issues (by confidence):
		Undefined: 0
		Low: 0
		Medium: 1
		High: 1
Files skipped (0):

ericwb avatar Mar 09 '24 20:03 ericwb

Hi, thank you for picking up the issue.

I am using Windows. I can reproduce the issue in PowerShell 7 (7.4.1), Command Prompt, and Git Bash (TERM = xterm-256color). The issue occurs in both 1.7.5 and the latest version (1.7.8).
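For anyone comparing environments, the following small diagnostic sketch prints the values most likely to matter here: the encoding Python picked for stdout, the locale's preferred encoding, and a few related environment variables. `describe_output_encoding` is a hypothetical helper name. `PYTHONIOENCODING` and `PYTHONUTF8` are standard Python environment variables, but whether setting them avoids the crash under pre-commit has not been confirmed in this thread.

```python
import locale
import os
import sys


def describe_output_encoding():
    """Collect the settings that determine how Python encodes text on stdout."""
    return {
        "platform": sys.platform,
        # May be None if stdout has been replaced by an object without an encoding.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "preferred_encoding": locale.getpreferredencoding(False),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "PYTHONUTF8": os.environ.get("PYTHONUTF8"),
        "TERM": os.environ.get("TERM"),
    }


if __name__ == "__main__":
    for name, value in describe_output_encoding().items():
        print(f"{name}: {value}")
```

Running it both directly in the terminal and with its output redirected to a file or pipe may show different stdout encodings, which would be consistent with the CLI working while the hook fails.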

Dantos7 avatar Mar 24 '24 09:03 Dantos7