codeql-cli-binaries
analyzing JS code with annotations
I've been trying to analyze some code from the lumo project. Some of the code contains annotations, see e.g.
https://github.com/anmonteiro/lumo/blob/master/src/js/util.js
which contains code like:
export function expandPath(somePath: string): string {
  const tildeExpandedPath = somePath.startsWith('~')
    ? somePath.replace(/^~/, os.homedir())
    : somePath;
  return path.resolve(tildeExpandedPath);
}
I have no problem running queries against this project, but when I try to create a test that analyzes some code fragments from this project, the extractor fails with a fatal error:
Could not extract a dataset in /Users/franktip/git/ApproximateCallGraphAnalysis/tests/testLumo: Extraction command /Users/franktip/codeql-home/codeql/tools/osx64/java/bin/java failed with status 1
Extraction command /Users/franktip/codeql-home/codeql/tools/osx64/java/bin/java failed with status 1
[1/1] FAILED(EXTRACTION) /Users/franktip/git/ApproximateCallGraphAnalysis/tests/testLumo/reachable.qlref
0 tests passed; 1 tests failed:
FAILED: /Users/franktip/git/ApproximateCallGraphAnalysis/tests/testLumo/reachable.qlref
I have a few questions:
- is there any way to inform the codeql test command that we're analyzing code with annotations?
- can a better error message be produced?
- strangely, the error goes away if I change the file type to ".ts" instead of ".js" (note though that the original project uses the .js extension)
Pinging @github/codeql-javascript.
is there any way to inform the codeql test command that we're analyzing code with annotations?
Yes, add a file named options containing:
semmle-extractor-options: --experimental
There's an unfortunate discrepancy between how codeql test and codeql database create build databases for JavaScript -- the former uses a legacy extractor interface which we're in the process of phasing out.
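For anyone following along, here is a minimal sketch of setting up that options file from a shell. The directory name tests/testLumo is only taken from the error output earlier in this thread as an example; point TEST_DIR at whichever directory holds your .qlref file. Since this relies on the legacy extractor interface, treat it as something that may stop working in later releases.

```shell
# Assumed location of the test directory (the one containing the .qlref file).
# Override it by exporting TEST_DIR before running this snippet.
TEST_DIR="${TEST_DIR:-tests/testLumo}"
mkdir -p "$TEST_DIR"

# Write the one-line options file next to the test sources.
printf 'semmle-extractor-options: --experimental\n' > "$TEST_DIR/options"

# Show what was written.
cat "$TEST_DIR/options"
```

The file has no extension and sits alongside the test's source files.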
can a better error message be produced?
@hmakholm shouldn't codeql test forward the error from the extractor? I'm getting this error with qltest:
[2020-06-03 21:57:13] [ERROR] Spawned process exited abnormally (code 1; tried to run: [HOME/target/intree/standard/tools/macos/jdk-extractor-java/bin/java, -jar, HOME/target/intree/standard/tools/extractor-javascript.jar, --quiet, --abort-on-parse-errors, --typescript, regression-test1.js])
[1/1] Extraction failed in HOME/ql/javascript/ql/test/library-tests/Regression (extraction: 614ms)
Unexpected token: 8:35
Invocation failed (exit code 1): HOME/target/intree/standard/tools/macos/jdk-extractor-java/bin/java -jar HOME/target/intree/standard/tools/extractor-javascript.jar --quiet --abort-on-parse-errors --typescript regression-test1.js
strangely, the error goes away if I change the file type to ".ts" instead of ".js" (note though that the original project uses the .js extension)
They use Flow syntax, which is different from TypeScript and doesn't have its own extension.
shouldn't codeql test forward the error from the extractor?
It should, but unfortunately that's not implemented yet. The internal issue is https://github.com/github/codeql-coreql-team/issues/333 -- which might receive greater priority now that there's an external complaint about it :-)
Thanks -- creating the options file works for me.
I have a follow-up question. First, some context: I created a test that contains some flow annotations. Then, in the directory containing the test, I ran "npm i flow-remove-types -SD" to install the annotation-remover --- this has the effect of installing many packages in a local node_modules subdirectory. Once that's installed, I can run "npm run flow:build" to strip away the annotations from the code, and the resulting stripped project is placed in a newly created subdirectory "lib". I can then run the JS code that is in this lib directory.
Now, when I run my query, I have two problems:
- the analysis becomes extremely slow, presumably because the extractor is finding all the code that was installed in the node_modules directory. However, this code is not relevant to my own project -- it's only there for the flow tools.
- if I remove the node_modules directory before running my analysis, my query runs quickly, but now I'm getting two sets of results: one for the annotated code in the "src" directory, and one for the stripped code in the lib directory.
So my question is: is there any way to inform "codeql test run" that it should only extract the code in the "src" directory?
Try appending --exclude lib --exclude node_modules to the line in the options file:
semmle-extractor-options: --experimental --exclude lib --exclude node_modules
Thanks! Are these options documented anywhere? (I was looking, but perhaps not in the right place)
Hmm, the --exclude lib --exclude node_modules options don't seem to make any difference. The query is still extremely slow.
Hm, try ./node_modules/** and ./lib/** instead. Sorry, as I mentioned, this extractor interface is being phased out and I don't think we have proper documentation for it. The best reference is this file.
Hi Asger, unfortunately this does not seem to work for me either. Any other suggestions?
So, looking into this a bit, the reason that @asgerf's suggestion does not work is that (for historical reasons) codeql test extracts each .js file on its own (which is also one of the reasons it is slow), and when a file is explicitly passed to the extractor, --exclude flags are ignored.
There is a somewhat silly workaround: create a trivial tsconfig.json file containing only {}. I'm not going to explain why that works, but if you do that and then create an options file looking like this
semmle-extractor-options: --experimental --exclude lib/* --exclude node_modules/*
then lib and node_modules should be excluded.
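Putting the workaround together, a rough sketch of the two files might look like the following. The TEST_DIR path is an assumption borrowed from the error output at the top of the thread; everything written to the files is taken verbatim from the suggestion above, and nothing here has been checked against newer CLI versions.

```shell
# Assumed test directory; override by exporting TEST_DIR first.
TEST_DIR="${TEST_DIR:-tests/testLumo}"
mkdir -p "$TEST_DIR"

# Trivial tsconfig.json containing only an empty object.
printf '{}\n' > "$TEST_DIR/tsconfig.json"

# Options file with the exclude patterns; single quotes keep the shell
# from expanding the * characters.
printf '%s\n' 'semmle-extractor-options: --experimental --exclude lib/* --exclude node_modules/*' > "$TEST_DIR/options"

# List the resulting files.
ls "$TEST_DIR"
```

After that, running codeql test run on the test directory should skip lib and node_modules, at least with the CLI version discussed here.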
Thanks, Max! I confirm that this works for me.
-Frank