codeql-cli-binaries analyzing JS code with annotations

I've been trying to analyze some code from the lumo project. Some of the code contains annotations, see e.g.

https://github.com/anmonteiro/lumo/blob/master/src/js/util.js

which contains code like:

export function expandPath(somePath: string): string { const tildeExpandedPath = somePath.startsWith('~') ? somePath.replace(/^~/, os.homedir()) : somePath; return path.resolve(tildeExpandedPath); }

I have no problem running queries against this project, but when I try to create a test that analyze some code fragments from this project, the extractor fails with a fatal error:

Could not extract a dataset in /Users/franktip/git/ApproximateCallGraphAnalysis/tests/testLumo: Extraction command /Users/franktip/codeql-home/codeql/tools/osx64/java/bin/java failed with status 1 Extraction command /Users/franktip/codeql-home/codeql/tools/osx64/java/bin/java failed with status 1 [1/1] FAILED(EXTRACTION) /Users/franktip/git/ApproximateCallGraphAnalysis/tests/testLumo/reachable.qlref 0 tests passed; 1 tests failed: FAILED: /Users/franktip/git/ApproximateCallGraphAnalysis/tests/testLumo/reachable.qlref

I have a few questions:

is there any way to inform the codeql test command that we're analyzing code with annotations?
can a better error message be produced?
strangely, the error goes away if I change the file type to ".ts" instead of ".js" (note though that the original project uses the .js extension)

Jun 03 '20 18:06 franktip

Pinging @github/codeql-javascript.

Jun 03 '20 19:06 hmakholm

is there any way to inform the codeql test command that we're analyzing code with annotations?

Yes, add a file named options containing:

semmle-extractor-options: --experimental

There's an unfortunate discrepancy between how codeql test and codeql database create builds databases for JavaScript -- the former uses a legacy extractor interface which we're in the process of phasing out.

can a better error message be produced?

@hmakholm shouldn't codeql test forward the error from the extractor? I'm getting this error with qltest:

[2020-06-03 21:57:13] [ERROR] Spawned process exited abnormally (code 1; tried to run: [HOME/target/intree/standard/tools/macos/jdk-extractor-java/bin/java, -jar, HOME/target/intree/standard/tools/extractor-javascript.jar, --quiet, --abort-on-parse-errors, --typescript, regression-test1.js])
[1/1] Extraction failed in HOME/ql/javascript/ql/test/library-tests/Regression (extraction: 614ms)
Unexpected token: 8:35
Invocation failed (exit code 1): HOME/target/intree/standard/tools/macos/jdk-extractor-java/bin/java -jar HOME/target/intree/standard/tools/extractor-javascript.jar --quiet --abort-on-parse-errors --typescript regression-test1.js

strangely, the error goes away if I change the file type to ".ts" instead of ".js" (note though that the original project uses the .js extension)

They use Flow syntax, which is different from TypeScript and doesn't have its own extension.

Jun 03 '20 21:06 asgerf

shouldn't codeql test forward the error from the extractor?

It should, but unfortunately that's not implemented yet. The internal issue is https://github.com/github/codeql-coreql-team/issues/333 -- which might receive greater priority now that there's an external complaint about it :-)

Jun 03 '20 21:06 hmakholm

Thanks -- creating the options file works for me.

I have a follow-up question. First, some context: I created a test that contains some flow annotations. Then, in the directory containing the test, I ran "npm i flow-remove-types -SD" to install the annotation-remover --- this has the effect of installing many packages in a local node_modules subdirectory. Once that's installed, I can run "npm run flow:build" to strip away the annotations from the code, and the resulting stripped project is placed in a newly created subdirectory "lib". I can then run the JS code that is in this lib directory.

Now, when I run my query, I have two problems:

the analysis becomes extremely slow, presumably because the extractor is finding all the code that was installed in the node_modules directory. However, this code is not relevant to my own project -- it's only there for the flow tools.
if I remove the node_modules directory before running my analysis, my query runs quickly, but now I'm getting two sets of results: one for the annotated code in the "src" directory, and one for the stripped code in the lib directory.

So my question is: is there any way to inform "codeql test run" that it should only extract the code in the "src" directory?

Jun 04 '20 17:06 franktip

Try appending --exclude lib --exclude node_modules to the line in the options file:

semmle-extractor-options: --experimental --exclude lib --exclude node_modules

Jun 04 '20 18:06 asgerf

Thanks! Are these options documented anywhere? (I was looking, but perhaps not in the right place)

Jun 04 '20 18:06 franktip

Hmm, the --exclude lib --exclude node_modules options doesn't seem to make any difference. The query is still extremely slow..

Jun 04 '20 18:06 franktip

Hm, try ./node_modules/** and ./lib/** instead. Sorry, as I mention this extractor interface is being phased out and I don't think we have proper documentation for it. The best reference is this file.

Jun 05 '20 09:06 asgerf

Hi Asger, unfortunately this does not seem to work for me either. Any other suggestions?

Jun 05 '20 18:06 franktip

So, looking into this a bit, the reason that @asgerf's suggestion does not work is that (for historical reasons) codeql test extracts each .js file on its own (which is also one of the reasons it is slow), and when a file is explicitly passed to the extractor, --exclude flags are ignored.

There is a somewhat silly workaround: create a trivial tsconfig.json file containing only {}. I'm not going to explain why that works, but if you do that and then create an options file looking like this

semmle-extractor-options: --experimental --exclude lib/* --exclude node_modules/*

then lib and node_modules should be excluded.

Jun 11 '20 14:06 max-schaefer

Thanks, Max! I confirm that this works for me.

-Frank

Jun 11 '20 19:06 franktip

codeql-cli-binaries codeql-cli-binaries copied to clipboard

analyzing JS code with annotations

codeql-cli-binaries
codeql-cli-binaries copied to clipboard