
Line numbers don't match files when Codyze CLI is executed with a folder (-s parameter)

Open agigleux opened this issue 4 years ago • 2 comments

Hello,

I extracted the test cases corresponding to JCA in this repo so it's easier for me to test and load the results into SonarQube/SonarCloud:

When I run Codyze with this command line, I'm getting results for AESCBC.java (findings-AESCBC.json.txt):

~/Softwares/codyze-1.4.1/bin/codyze -c -s=src/main/java/jca/AESCBC.java -m=/home/alex/Softwares/codyze-1.4.1/mark/bouncycastle/ --no-good-findings

When I run this command line looking at all the Java files under the directory src/main/java/jca/, I'm getting different results for AESCBC.java (findings-all.json.txt).

~/Softwares/codyze-1.4.1/bin/codyze -c -s=src/main/java/jca/ -m=/home/alex/Softwares/codyze-1.4.1/mark/bouncycastle/ --no-good-findings

I'm getting 21 problems when I target only AESCBC.java, while I'm getting only 13 problems for AESCBC.java when I target the entire folder.

I would expect to see the same quantity of problems because the files under the directory src/main/java/jca/ have no relationship.
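To make the comparison between the two runs reproducible, the findings can be counted per file in each JSON report. The following is only a sketch: the overall layout of the Codyze findings file is not shown in this issue, so it assumes that every finding is a JSON object with a "locations" list whose entries carry "artifactLocation"."uri", as in the excerpt in this report. The function names and the sample data are made up for illustration.

```python
import json
from collections import Counter


def iter_findings(node):
    """Yield every JSON object that has a "locations" list, at any depth."""
    if isinstance(node, dict):
        if isinstance(node.get("locations"), list):
            yield node
        for value in node.values():
            yield from iter_findings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_findings(item)


def findings_per_file(report):
    """Count findings per source file name (assumed report layout, see above)."""
    counts = Counter()
    for finding in iter_findings(report):
        uris = set()
        for location in finding["locations"]:
            if isinstance(location, dict):
                uri = location.get("artifactLocation", {}).get("uri")
                if uri:
                    uris.add(uri)
        # A finding with several locations in one file is counted once.
        for uri in uris:
            counts[uri.rsplit("/", 1)[-1]] += 1
    return counts


# Made-up sample data, NOT the real findings from this issue.
sample_single = json.loads("""
[
  {"locations": [{"artifactLocation": {"uri": "file:/tmp/jca/AESCBC.java"}}]},
  {"locations": [{"artifactLocation": {"uri": "file:/tmp/jca/AESCBC.java"}}]}
]
""")
sample_folder = json.loads("""
[
  {"locations": [{"artifactLocation": {"uri": "file:/tmp/jca/AESCBC.java"}}]},
  {"locations": [{"artifactLocation": {"uri": "file:/tmp/jca/AESGMAC.java"}}]}
]
""")

single_counts = findings_per_file(sample_single)
folder_counts = findings_per_file(sample_folder)
for name in sorted(set(single_counts) | set(folder_counts)):
    print(name, single_counts[name], folder_counts[name])
```

With the real reports, the two samples would be replaced by `json.load(open(...))` on findings-AESCBC.json.txt and findings-all.json.txt.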

The second problem is the inconsistency of the line numbers when targeting a folder. For example, a problem is raised on AESGMAC.java at line 17 (so line 18 in reality), but that line has only 44 characters, not 66:

  "locations": [
    {
      "region": {
        "endLine": 17,
        "endColumn": 66,
        "startColumn": 9,
        "startLine": 17
      },
      "artifactLocation": {"uri": "file:/home/alex/Repos/Java_Validation/codyze-java-testcases/src/main/java/jca/AESGMAC.java"}
    },
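A region like this one can be checked mechanically against the source file. The sketch below is a hypothetical helper, not part of Codyze: it tests whether a region's end column fits within the line it points to. It assumes that reported line numbers are 0-based (which is what "17, so 18 for real" suggests) and it tolerates an end column one past the last character, since the column convention of the report is not confirmed here.

```python
def region_fits_line(region, source_lines, line_base=0):
    """Return True if the region's end column fits within its end line.

    line_base is the number assigned to the first line of the file
    (0 is an assumption based on the off-by-one seen in this issue).
    """
    start_index = region["startLine"] - line_base
    end_index = region["endLine"] - line_base
    if not (0 <= start_index <= end_index < len(source_lines)):
        return False
    end_line_length = len(source_lines[end_index].rstrip("\r\n"))
    if region["startColumn"] < 0:
        return False
    # Allow one extra column in case the end column is exclusive.
    return region["endColumn"] <= end_line_length + 1


# Stand-in file content: 17 short lines followed by a 44-character line.
fake_source = ["x" * 10] * 17 + ["y" * 44]

reported = {"startLine": 17, "endLine": 17, "startColumn": 9, "endColumn": 66}
print(region_fits_line(reported, fake_source))
```

Running this over every location in a report, with `source_lines` read from the file named in "artifactLocation", would list all regions whose columns cannot belong to the line they reference.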


agigleux avatar Nov 30 '20 14:11 agigleux

Hi! Thanks for reporting this. This is indeed a very strange bug regarding scanning a file vs. a folder.

The line number inconsistency looks to be the same as #106. My guess is that this arises from the fact that we use both SARIF and LSP code regions / locations in Codyze and the underlying code property graph library. I guess that converting between those two goes wrong at some point and the line width of the wrong line is used (which I guess is 66 in the case of line 17).
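For reference, the two conventions differ by one: SARIF regions use 1-based lines and columns, while LSP positions use 0-based lines and characters. In both, the end of a range points just past the last character. The sketch below shows a plain conversion between the two on dictionaries; it is not the Codyze or CPG code, the function names are made up, and it ignores differences in how columns are counted (for example UTF-16 code units vs. code points).

```python
def lsp_range_to_sarif_region(lsp_range):
    """Convert a 0-based LSP range into a 1-based SARIF region."""
    return {
        "startLine": lsp_range["start"]["line"] + 1,
        "startColumn": lsp_range["start"]["character"] + 1,
        "endLine": lsp_range["end"]["line"] + 1,
        "endColumn": lsp_range["end"]["character"] + 1,
    }


def sarif_region_to_lsp_range(region):
    """Convert a 1-based SARIF region back into a 0-based LSP range."""
    return {
        "start": {
            "line": region["startLine"] - 1,
            "character": region["startColumn"] - 1,
        },
        "end": {
            "line": region["endLine"] - 1,
            "character": region["endColumn"] - 1,
        },
    }


# Hypothetical LSP range on the 18th line of a file (index 17).
example_range = {
    "start": {"line": 17, "character": 8},
    "end": {"line": 17, "character": 43},
}
example_region = lsp_range_to_sarif_region(example_range)
print(example_region)
```

If one side of a conversion skips the offset, or applies it to the line but looks up the column on a different line, the result would be a region that points one line off, or whose columns belong to a neighbouring line.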

oxisto avatar Nov 30 '20 22:11 oxisto

Hi @agigleux,

Thank you for the issue. I have verified that the analysis produces different results when scanning a single file vs. scanning a folder. With the given files, I don't see a reason for this behavior; the findings should in fact be very similar.

My hope is that I will find the problem with the lines/columns along the way as well.

I need to investigate further. For now, I've created a WIP PR (#137) to track the progress.

fwendland avatar Jan 14 '21 14:01 fwendland