Extraction error with tsg-python

Open tvalenta opened this issue 5 months ago • 1 comments

CodeQL is unable to extract and parse a Python3 file with the following line:

match["something"] = somethingelse

The CodeQL error just reports a syntax error, and the logs aren't helpful.

A parse error occurred while processing <file> and as a result this file could not be analyzed. Check the syntax of the file using the python -m py_compile command and correct any invalid syntax.
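For reference, a minimal sketch of the check that message suggests, using Python's own parser (ast.parse) as a stand-in for python -m py_compile. The names SOURCE and parses_cleanly are made up for illustration and are not part of CodeQL or tsg-python. Since match is only a soft keyword in Python 3.10+, the line should be accepted as an ordinary subscript assignment, which points at the extractor rather than the file.

```python
import ast

# The line from the report; `match` is used here as an ordinary variable name.
SOURCE = 'match["something"] = somethingelse\n'


def parses_cleanly(source: str) -> bool:
    """Return True if CPython's parser accepts `source` (hypothetical helper)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    # If this prints True, the file itself is valid Python.
    print(parses_cleanly(SOURCE))
```

If this reports the line as valid while the extractor still flags it, the parse error comes from the extractor's grammar, not from the source file.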

I've found that tsg-python throws an error for the word match when it is used in this syntax, but succeeds if the variable is renamed to xmatch.

Failure - match

node 1
  _kind: "SyntaxErrorNode"
  _location: [0, 5, 0, 18]
  source: "[\"something\"]"

Success - xmatch

xmatch["something"] = somethingelse

Bumping the version of tree-sitter in the Python extractor seems to fix this. https://github.com/github/codeql/blob/bd21a03fc347ae7aa46af6c5a12682786f018c2c/python/extractor/tsg-python/Cargo.toml#L13

I no longer get a syntax error when building with tree-sitter = "=0.20.10", which is the latest version with which I could get the extractor to build. I'm not sure whether the fix is this simple, or whether tree-sitter-graph also needs to be bumped.

tvalenta avatar Jun 11 '25 19:06 tvalenta