bandit
bandit copied to clipboard
Improve handling nosec for multi-line strings
This commit improves handling nosecs in multi-line strings, like:
1. nosec_not_working = f"""
2. SELECT * FROM {table}
3. """ # nosec
Before this change, bandit was checking if there is
a nosec in line 1. Now, in the case of ast Constants
it searches for nosec in the last line of the expression.
This commit also moves detecting nosec without test number
to test phase from "pre-visit" phase.
It may increase the time of performing checks but avoids
counting the same nosec mark multiple times.
"pre-visit" phase is run separately for each part of multi-line
string split by FormattedValue
items. Thus for the above example,
it would be run twice, the first time for "\n SELECT * FROM "
and the second time for "\n"
making the nosec being counted twice.
Resolves: #880
So I think using the last line of linerange will fail for Python 3.7. Currently Bandit incorrectly reports the last line number for Py37. See issue #820
Thank you for taking a look at my changes.
I'm not sure if I understand. Do you mean that my changes are not going to work in Python 3.7?
get_nosec
function returns context['lineno']
for strings in Python 3.7. Why? Because in the case of Python 3.7 strings are parsed to ast.Str
. In the case of Python 3.10 they are parsed to ast.Constant
. It looks like ast.Constant
is used from 3.8 according to https://docs.python.org/3/library/ast.html:
Changed in version 3.8: Class ast.Constant is now used for all constants.
I did the following test:
file: string_examples.py
a = "string 1"
b = f"f-string {a}"
c = """multi-line
string
1"""
d = f"""multi-line
f-string
{a}"""
file: ast_example.py
import ast
with open('string_examples.py', 'r') as f:
ast_tree = ast.parse(f.read())
for node in ast_tree.body:
print(ast.dump(node))
And the output is:
- python 3.7:
Assign(targets=[Name(id='a', ctx=Store())], value=Str(s='string 1'))
Assign(targets=[Name(id='b', ctx=Store())], value=JoinedStr(values=[Str(s='f-string '), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1, format_spec=None)]))
Assign(targets=[Name(id='c', ctx=Store())], value=Str(s='multi-line\nstring\n1'))
Assign(targets=[Name(id='d', ctx=Store())], value=JoinedStr(values=[Str(s='multi-line\nf-string\n'), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1, format_spec=None)]))
- python 3.8:
Assign(targets=[Name(id='a', ctx=Store())], value=Constant(value='string 1', kind=None), type_comment=None)
Assign(targets=[Name(id='b', ctx=Store())], value=JoinedStr(values=[Constant(value='f-string ', kind=None), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1, format_spec=None)]), type_comment=None)
Assign(targets=[Name(id='c', ctx=Store())], value=Constant(value='multi-line\nstring\n1', kind=None), type_comment=None)
Assign(targets=[Name(id='d', ctx=Store())], value=JoinedStr(values=[Constant(value='multi-line\nf-string\n', kind=None), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1, format_spec=None)]), type_comment=None)
- python 3.10:
Assign(targets=[Name(id='a', ctx=Store())], value=Constant(value='string 1'))
Assign(targets=[Name(id='b', ctx=Store())], value=JoinedStr(values=[Constant(value='f-string '), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1)]))
Assign(targets=[Name(id='c', ctx=Store())], value=Constant(value='multi-line\nstring\n1'))
Assign(targets=[Name(id='d', ctx=Store())], value=JoinedStr(values=[Constant(value='multi-line\nf-string\n'), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1)]))
Thank you for working on this. This also fixes #658 reported in December 2020.
Hi @ericwb , would you find a while to take a look at this pull request? If you think it should be fixed in another way, do not hesitate to say so. I'm willing to spend some more time on this if needed.
Hello guys, can we expect this PR to be merged some time soon?
I added tests for implicit concatenation cases and it turned out that my solution didn't work. I changed the approach and now I search for nosec
in every line reported in linerange
property. I also abandoned using lineno
property and always use linerange
so the code became a bit simpler.
nosec
can be now placed at any line of "implicit concatenation" expression.
Except python 3.7. Indeed, as @ericwb said, we do not have the last line of the analyzed expression thus in the case of python 3.7 nosec
has to be at the first line of the "implicit concatenation" expression. And to be honest, I have no idea on how to fix this.
In the case of a multiline string, as shown below, nosec
is detected when it is placed at the last line:
f"""
SELECT *
FROM foo
WHERE id = {identifier}
""" # nosec
independently from the python version.
@sigmavirus24 Thank you!
Not sure but this change appears to have broken our Python 3.7 runs. We had tests getting skipped and now ~none~ a portion of them are. Still investigating root cause.
@JRemitz Please let me know if you have examples that should be skipped and are not.
Definitely and I think I'm eating my words... we had a reduction in "skipped" tests, but the new ones were not previously skipped, they were two of the new rules that were added. I am able to correlate those, I just need to dig up why the skipped tests count dropped. I'll let you know if that has any relationship to this change. Apologies for the noise!