codeql icon indicating copy to clipboard operation
codeql copied to clipboard

False Positive - Mismatch in multiple assignment

Open MatthewReid854 opened this issue 3 years ago • 8 comments

False positive for "Mismatch in multiple assignment" in Python.

A mismatch in multiple assignment can occur when the number of values assigned does not match the number of variables to which they are assigned. For example a,b=1,2,3 would cause this error in Python.

The false positive occurs when the number of variables is not directly assigned, but instead assigned to a function such as: out1,out2 = myfunction(in1,in2,in3) This is not direct assignment of variables but LGTM triggers a false positive on some occasions when this syntax is used.

URL to the alert on the project page on LGTM.com is available from here In this file, there are 18 alerts and they are all for the same style of false positive.

Another issue is that to suppress an alert the # lgtm [py/mismatched-multiple-assignment] comment must be on the same line as the offending code. If the offending code runs across multiple lines (as is common when using an auto formatter like Black) the suppression comment does not work.

MatthewReid854 avatar May 01 '22 03:05 MatthewReid854

It looks like you are getting this error because the function can return 4, 5, or 10 values depending on what value of func is passed in.. So, this doesn't look like. It looks like given the constants you have passed in, the expected number of return values matches your assignments, but the static analysis isn't able to determine this. I wouldn't even call this a false-positive since I'd think it's easy to misread the return values even for a person eye-balling the function.

My suggestion is either refactor the function so that it always returns the same number of parameters (even if some of them are empty or unused), or use LGTM facilities to disable this query.

aeisenberg avatar May 02 '22 15:05 aeisenberg

Thanks for your review of this @aeisenberg. I agree that it's difficult for LGTM to figure out whether the number of values returned match the number expected as there are a number of use cases for this function. I will proceed with error suppression as I'd rather not refactor the code if there is no error.

The issue I have with error suppression is that to suppress an alert the # lgtm [py/mismatched-multiple-assignment] comment must be on the same line as the offending code. If the offending code runs across multiple lines (as is common when using an auto formatter like Black) the suppression comment does not work. Is there a way around this that does not involve having a very long line of code?

MatthewReid854 avatar May 03 '22 11:05 MatthewReid854

I don't know the answer to this. I'll ask the team that maintains LGTM.

aeisenberg avatar May 04 '22 04:05 aeisenberg

@MatthewReid854 The suppression comment needs to be on the line where the alert starts, which is line 392 in this case.

Note that you don't have to include the query identifier ([py/mismatched-multiple-assignment]) a simple #lgtm also works.

You can also use #noqa instead (~~with or~~ without query identifier). It's equally long, but perhaps the auto formatter knows that these should not be moved to a next line

aibaars avatar May 04 '22 08:05 aibaars

Thanks @aibaars for the tip about the shorthand comment for alert suppression. Unfortunately this does nothing to resolve the problem of needing the query identifier comment on the same line that the offending code starts. Like many Python developers, I use Black to auto-format my code. It does not care what the comment is, if the line is greater than 80 characters it gets wrapped. It would be amazing if LGTM had the ability to check comments at the end of a line of code for query identifiers even if that line of code has been wrapped across multiple lines. Assuming the LGTM developer team is interested in addressing this deficiency, would you like me to raise this as a separate issue to this false positive?

MatthewReid854 avatar May 05 '22 13:05 MatthewReid854

The shorthand forms #lgtm and #noqa should hopefully be short enough to be left in place when Black auto-formats the code.

There is very little development on LGTM these days as the focus of the team is on CodeScanning these days.

That said, it might still be useful to open an issue (or a pull request) to request improvements to https://github.com/github/codeql/blob/main/python/ql/src/analysis/AlertSuppression.ql

Trying to look for a query identifier on a next comment line may be a little tricky as it may be hard to distinguish a wrapped identifier from an unrelated comment that happens to be on the next line. An alternative would be to make an #lgtm comment that are at the start of a line apply to the code on the next line.

That way the following two examples would work the same:

# lgtm [py/mismatched-multiple-assignment]
a,b=1,2,3

and

a,b=1,2,3 # lgtm [py/mismatched-multiple-assignment]

I think this can be achieved by modifying https://github.com/github/codeql/blob/ca2959cf37179380560b12148347f007fd28c543/python/ql/src/analysis/AlertSuppression.ql#L43-L48

aibaars avatar May 05 '22 15:05 aibaars

There is very little development on LGTM these days as the focus of the team is on CodeScanning these days.

Does this mean that this false positive is unlikely to be fixed anytime soon? If so, how does that square with the invitation in the documentation to file issues for false positives?

If you believe that a particular alert is a false positive that shouldn't have been reported by LGTM, tell us! Suppressing the alert will no longer flag the alert during analysis, but it won't change the fact that the query is wrong and needs fixing. Let us know by creating an issue in our public CodeQL repository on GitHub. Don't forget to give us the URL for the full source code view of the alert. We will fine-tune the relevant query to remove false positives. This will improve the analysis for everyone.

I hope there is agreement that flagging

a, b = foo()

as a mismatch is a false positive if it can be proven that foo always returns a tuple of two elements but LGTM's static analysis doesn't go deep enough to determine that fact. It is fine for static analysis to be partial, so the results are always going to fall into three distinct categories: match, mismatch and unknown. In this case LGTM classifies the code fragment as a mismatch whereas the expected classification is unknown.

hannes-ucsc avatar Jul 19 '22 00:07 hannes-ucsc

There is very little development on LGTM these days as the focus of the team is on CodeScanning these days.

Does this mean that this false positive is unlikely to be fixed anytime soon? If so, how does that square with the invitation in the documentation to file issues for false positives?

@hannes-ucsc The comment above is related to the request to improve the suppression syntax for LGTM. Since the teams focus is not on LGTM this is unlikely to be changed. Both LGTM and CodeScanning use the CodeQL static analyser under the hood. CodeQL is under active development in this repository and the team greatly values false positive reports.

You're absolutely right that this is a false positive if foo always returns two elements.

aibaars avatar Jul 19 '22 09:07 aibaars