Compress release bundle with zstandard/zstd to reduce size
I propose a .zstd download option alongside the existing .gz one for Linux releases. For the latest 2.18.1 linux64 bundle, switching from gzip to zstd cuts the file size by about 33%, from 822.8 MiB down to 553.2 MiB.
Example command to convert the existing .gz:
zcat codeql-bundle-linux64.tar.gz | zstd --long=27 -9 -o codeql-bundle-linux64.tar.zstd
File sizes:
862823301 codeql-bundle-linux64.tar.gz
580124258 codeql-bundle-linux64.tar.zstd
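The savings figure can be rechecked from the two byte counts above. This is a minimal sketch; pct_saved is a hypothetical helper name introduced here for illustration, and it assumes only that awk is available.

```shell
# Percentage saved when a file shrinks from $1 bytes to $2 bytes.
# pct_saved is a hypothetical helper, not part of zstd or CodeQL.
pct_saved() {
  # LC_ALL=C keeps the decimal separator a dot regardless of locale.
  LC_ALL=C awk -v old="$1" -v new="$2" \
    'BEGIN { printf "%.1f\n", (1 - new / old) * 100 }'
}

# .tar.gz size vs .tar.zstd size from the listing above
pct_saved 862823301 580124258   # prints 32.8
```

The same helper works for any pair of sizes, so it can be reused to compare other compression levels.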
For zstd arguments, compression levels above -9 saw diminishing returns, though -19 does get down to 504.5 MiB while taking 12x longer to compress. Using higher --long= values improves compression, but 27 is the highest value that clients can process by default, per https://github.com/facebook/zstd/blob/dev/programs/zstd.1.md?plain=1#L162
Compression with xz is also an improvement, but it is noticeably slower. Either is an improvement over plain .gz, and any recent Linux distribution can decompress both .zstd and .xz.
Thanks for your feedback. We'll take this into consideration.
This was implemented, but it is now failing on GitHub Enterprise because the base Docker images used by ARC don't include zstd in any of their tags, and the current v3 tag points to a version that requires zstd.
So CodeQL can't be initialized on the default runner... I don't see any release of the runner that includes zstd in https://github.com/actions/runner/blob/main/images/Dockerfile ...
That is unfortunate. Also, the .zst archives being created are much larger than necessary since the --long=27 flag was not used. For the most recent linux bundle I get a 25% smaller file:
curl -LO https://github.com/github/codeql-action/releases/download/codeql-bundle-v2.20.1/codeql-bundle-linux64.tar.zst
stat -c %s codeql-bundle-linux64.tar.zst
608400767
cat codeql-bundle-linux64.tar.zst | zstd -d | zstd --long=27 | wc -c
455054249
@marcellodesales : According to our engineers, the job should only download the .zst bundle when zstd exists on the path, falling back to tar if it doesn't. Can I please ask you to rerun the job in debug mode, and upload the log files if possible, for us to better debug the issue?
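The intended behaviour described here can be sketched in shell. This is only an illustration of the idea, not the action's actual code: choose_bundle is a hypothetical name, and reading "falling back to tar" as "download the .tar.gz bundle instead" is an assumption.

```shell
# Print the bundle file name to download, depending on whether the
# decompressor named in $1 (default: zstd) can be found on PATH.
# choose_bundle and the .tar.gz fallback are illustrative assumptions.
choose_bundle() {
  tool=${1:-zstd}
  if command -v "$tool" >/dev/null 2>&1; then
    echo "codeql-bundle-linux64.tar.zst"
  else
    echo "codeql-bundle-linux64.tar.gz"
  fi
}

choose_bundle
```

The point of checking before downloading is that `tar --zstd` delegates to an external zstd binary, so picking the .zst asset without that check fails only at extraction time.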
@DSmithVA : Thanks a lot for bringing this to our attention; we are currently testing this approach, and it does indeed look promising.
@hvitved I have posted the info below at https://github.com/github/codeql-action/issues/2705#issuecomment-2605344817 as well... All languages fail with the latest version...
🔧 Settings
- name: Initialize CodeQL
  uses: github/codeql-action/init@v3.28.1
  with:
    debug: true
    languages: go
    build-mode: "manual"
    config-file: .github/codeql-config.yml
⌨ Logs
##[debug]Evaluating condition for step: 'Initialize CodeQL'
##[debug]Evaluating: success()
##[debug]Evaluating success:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: Initialize CodeQL
##[debug]Register post job cleanup for action: github/codeql-action/init@v3.28.1
##[debug]Loading inputs
##[debug]Evaluating: secrets.ACCESS_TOKEN
##[debug]Evaluating Index:
##[debug]..Evaluating secrets:
##[debug]..=> Object
##[debug]..Evaluating String:
##[debug]..=> 'ACCESS_TOKEN'
##[debug]=> null
##[debug]Result: null
##[debug]Evaluating: github.token
##[debug]Evaluating Index:
##[debug]..Evaluating github:
##[debug]..=> Object
##[debug]..Evaluating String:
##[debug]..=> 'token'
##[debug]=> '***'
##[debug]Result: '***'
##[debug]Evaluating: toJson(matrix)
##[debug]Evaluating toJson:
##[debug]..Evaluating matrix:
##[debug]..=> null
##[debug]=> 'null'
##[debug]Result: 'null'
##[debug]Loading env
Run github/codeql-action/init@v3.28.1
Job run UUID is 0cde5708-94e4-46a6-80e2-deb7dfb95ff0.
##[debug]Running git command: git rev-parse HEAD
##[debug]Sending status report: {"action_name":"init","action_oid":"unknown","action_ref":"v3.28.1","action_started_at":"2025-01-21T17:23:55.998Z","action_version":"3.28.1","analysis_key":".github/workflows/codeql-golang.yml:analyze","commit_oid":"d8d1429bce66e76202d14b5cc22251b91dfaa91f","first_party_analysis":true,"job_name":"analyze","job_run_uuid":"0cde5708-94e4-46a6-80e2-deb7dfb95ff0","ref":"refs/pull/104/merge","runner_os":"Linux","started_at":"2025-01-21T17:23:55.998Z","status":"starting","steady_state_default_setup":false,"testing_environment":"","workflow_name":"codeQL-golang","workflow_run_attempt":2,"workflow_run_id":2678337,"actions_event_name":"pull_request","runner_available_disk_space_bytes":7419637760,"runner_total_disk_space_bytes":8589934592,"matrix_vars":"null","runner_arch":"X64"}
::group::Setup CodeQL tools
Setup CodeQL tools
##[debug]Found tar.
##[debug]Could not find zstd: Error: Unable to locate executable file: zstd. Please verify either the file path exists or the file can be found within a directory specified by the PATH environment variable. Also check the file mode to verify the file is executable.
/usr/bin/tar --version
tar (GNU tar) 1.34
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by John Gilmore and Jay Fenlason.
Found gnu tar version 1.34.
##[debug]Attempting to obtain CodeQL tools. CLI version: 2.20.1, bundle tag name: codeql-bundle-v2.20.1, URL: unspecified.
##[debug]isExplicit: 2.20.1
##[debug]explicit? true
##[debug]checking cache: /home/runner/_work/_tool/CodeQL/2.20.1/x64
##[debug]not found
##[debug]Didn't find a version of the CodeQL tools in the toolcache with a version number exactly matching 2.20.1.
##[debug]Found the following versions of the CodeQL tools in the toolcache: [].
##[debug]Didn't find any versions of the CodeQL tools starting with 2.20.1 in the toolcache. Trying next fallback method.
##[debug]Computed a fallback toolcache version number of 2.20.1 for CodeQL version 2.20.1.
##[debug]isExplicit: 2.20.1
##[debug]explicit? true
##[debug]checking cache: /home/runner/_work/_tool/CodeQL/2.20.1/x64
##[debug]not found
Did not find CodeQL tools version 2.20.1 in the toolcache.
##[debug]Did not find any candidate pinned versions of the CodeQL tools in the toolcache.
Found CodeQL bundle in github/codeql-action on https://git.company.com with URL https://git.company.com/api/v3/repos/github/codeql-action/releases/assets/5565.
Using CodeQL CLI version 2.20.1 sourced from https://git.company.com/api/v3/repos/github/codeql-action/releases/assets/5565 .
##[debug]Providing an authorization token to download CodeQL tools.
##[debug]Not running against github.com. Disabling all toggleable features.
##[debug]Writing feature flags to /home/runner/_work/_temp/cached-feature-flags.json
##[debug]Feature 'extract_to_toolcache' undefined in API response.
##[debug]Feature extract_to_toolcache is disabled due to its default value.
Downloading CodeQL tools from https://git.company.com/api/v3/repos/github/codeql-action/releases/assets/5565 . This may take a while.
Streaming the extraction of the CodeQL bundle.
##[debug]Extracting to /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244. Input stream has high water mark 4194304.
tar -x --zstd --warning=no-unknown-keyword --overwrite -f - -C /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244
tar (grandchild): zstd: Cannot exec: No such file or directory
tar (grandchild): Error is not recoverable: exiting now
tar: Child died with signal 13
tar: Error is not recoverable: exiting now
##[debug]Cleaning up extraction destination directory.
##[debug]Cleaned up extraction destination directory.
Warning: Failed to download and extract CodeQL bundle using streaming with error: Error while downloading and extracting tar: Error: write EPIPE
Warning: Falling back to downloading the bundle before extracting.
##[debug]Cleaning up CodeQL bundle.
Warning: Failed to clean up CodeQL bundle: no files found matching /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244.
##[debug]Downloading https://git.company.com/api/v3/repos/github/codeql-action/releases/assets/5565
##[debug]Destination /home/runner/_work/_temp/ca3b4527-1a21-43d9-8713-81909027bb0a
##[debug]set auth
##[debug]download complete
Finished downloading CodeQL bundle to /home/runner/_work/_temp/ca3b4527-1a21-43d9-8713-81909027bb0a (11.1s).
Extracting CodeQL bundle.
##[debug]Extracting to /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244.
tar -x --zstd --warning=no-unknown-keyword --overwrite -f /home/runner/_work/_temp/ca3b4527-1a21-43d9-8713-81909027bb0a -C /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244
tar (child): zstd: Cannot exec: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
##[debug]Cleaning up extraction destination directory.
##[debug]Cleaned up extraction destination directory.
##[debug]Cleaning up CodeQL bundle archive.
##[debug]Cleaned up CodeQL bundle archive.
Error: Unable to download and extract CodeQL CLI: Failed to run "tar -x --zstd --warning=no-unknown-keyword --overwrite -f /home/runner/_work/_temp/ca3b4527-1a21-43d9-8713-81909027bb0a -C /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244". Exit code was 2 and last log line was: n/a. See the logs for more details.
Details: Error: Failed to run "tar -x --zstd --warning=no-unknown-keyword --overwrite -f /home/runner/_work/_temp/ca3b4527-1a21-43d9-8713-81909027bb0a -C /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244". Exit code was 2 and last log line was: n/a. See the logs for more details.
at ChildProcess.<anonymous> (/home/runner/_work/_actions/github/codeql-action/v3.28.1/lib/tar.js:171:28)
at ChildProcess.emit (node:events:519:28)
at ChildProcess._handle.onexit (node:internal/child_process:294:12)
##[debug]Running git command: git rev-parse HEAD
##[debug]Sending status report: {"action_name":"init","action_oid":"unknown","action_ref":"v3.28.1","action_started_at":"2025-01-21T17:23:55.998Z","action_version":"3.28.1","analysis_key":".github/workflows/codeql-golang.yml:analyze","commit_oid":"d8d1429bce66e76202d14b5cc22251b91dfaa91f","first_party_analysis":true,"job_name":"analyze","job_run_uuid":"0cde5708-94e4-46a6-80e2-deb7dfb95ff0","ref":"refs/pull/104/merge","runner_os":"Linux","started_at":"2025-01-21T17:23:55.998Z","status":"aborted","steady_state_default_setup":false,"testing_environment":"","workflow_name":"codeQL-golang","workflow_run_attempt":2,"workflow_run_id":2678337,"actions_event_name":"pull_request","runner_available_disk_space_bytes":7419633664,"runner_total_disk_space_bytes":8589934592,"cause":"Unable to download and extract CodeQL CLI: Failed to run \"tar -x --zstd --warning=no-unknown-keyword --overwrite -f /home/runner/_work/_temp/ca3b4527-1a21-43d9-8713-81909027bb0a -C /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244\". Exit code was 2 and last log line was: n/a. See the logs for more details.\n\nDetails: Error: Failed to run \"tar -x --zstd --warning=no-unknown-keyword --overwrite -f /home/runner/_work/_temp/ca3b4527-1a21-43d9-8713-81909027bb0a -C /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244\". Exit code was 2 and last log line was: n/a. See the logs for more details.\n at ChildProcess.<anonymous> (/home/runner/_work/_actions/github/codeql-action/v3.28.1/lib/tar.js:171:28)\n at ChildProcess.emit (node:events:519:28)\n at ChildProcess._handle.onexit (node:internal/child_process:294:12)","exception":"Error: Unable to download and extract CodeQL CLI: Failed to run \"tar -x --zstd --warning=no-unknown-keyword --overwrite -f /home/runner/_work/_temp/ca3b4527-1a21-43d9-8713-81909027bb0a -C /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244\". Exit code was 2 and last log line was: n/a. See the logs for more details.\n\nDetails: Error: Failed to run \"tar -x --zstd --warning=no-unknown-keyword --overwrite -f /home/runner/_work/_temp/ca3b4527-1a21-43d9-8713-81909027bb0a -C /home/runner/_work/_temp/c2146770-b178-4be5-9164-0a0e8345e244\". Exit code was 2 and last log line was: n/a. See the logs for more details.\n at ChildProcess.<anonymous> (/home/runner/_work/_actions/github/codeql-action/v3.28.1/lib/tar.js:171:28)\n at ChildProcess.emit (node:events:519:28)\n at ChildProcess._handle.onexit (node:internal/child_process:294:12)\n at setupCodeQL (/home/runner/_work/_actions/github/codeql-action/v3.28.1/lib/codeql.js:150:15)\n at async initCodeQL (/home/runner/_work/_actions/github/codeql-action/v3.28.1/lib/init.js:55:97)\n at async run (/home/runner/_work/_actions/github/codeql-action/v3.28.1/lib/init-action.js:175:34)\n at async runWrapper (/home/runner/_work/_actions/github/codeql-action/v3.28.1/lib/init-action.js:436:9)","completed_at":"2025-01-21T17:24:08.201Z","matrix_vars":"null","runner_arch":"X64"}
##[debug]Node Action run completed with exit code 1
##[debug]CODEQL_ACTION_FEATURE_MULTI_LANGUAGE='false'
##[debug]CODEQL_ACTION_FEATURE_SANDWICH='false'
##[debug]CODEQL_ACTION_FEATURE_SARIF_COMBINE='true'
##[debug]CODEQL_ACTION_FEATURE_WILL_UPLOAD='true'
##[debug]CODEQL_ACTION_VERSION='3.28.1'
##[debug]CODEQL_ACTION_WARNED_ABOUT_VERSION='true'
##[debug]JOB_RUN_UUID='0cde5708-94e4-46a6-80e2-deb7dfb95ff0'
##[debug]CODEQL_ACTION_INIT_HAS_RUN='true'
##[debug]CODEQL_ACTION_ANALYSIS_KEY='.github/workflows/codeql-golang.yml:analyze'
##[debug]CODEQL_WORKFLOW_STARTED_AT='2025-01-21T17:23:55.998Z'
##[debug]CODEQL_ACTION_JOB_STATUS='JOB_STATUS_FAILURE'
##[debug]Save intra-action state persisted_inputs = [["INPUT_DEBUG","true"],["INPUT_LANGUAGES","go"],["INPUT_BUILD-MODE","manual"],["INPUT_CONFIG-FILE",".github/codeql-config.yml"],["INPUT_QUERIES","security-extended,security-and-quality"],["INPUT_EXTERNAL-REPOSITORY-TOKEN",""],["INPUT_TOOLS",""],["INPUT_TOKEN","***"],["INPUT_REGISTRIES",""],["INPUT_MATRIX","null"],["INPUT_DB-LOCATION",""],["INPUT_CONFIG",""],["INPUT_PACKS",""],["INPUT_SETUP-PYTHON-DEPENDENCIES",""],["INPUT_SOURCE-ROOT",""],["INPUT_RAM",""],["INPUT_THREADS",""],["INPUT_DEBUG-ARTIFACT-NAME",""],["INPUT_DEBUG-DATABASE-NAME",""],["INPUT_TRAP-CACHING",""],["INPUT_DEPENDENCY-CACHING",""]]
##[debug]Finishing: Initialize CodeQL
@henrymercer : Is the log above sufficient for you to debug? I notice the line
##[debug]Could not find zstd: Error: Unable to locate executable file: zstd. Please verify either the file path exists or the file can be found within a directory specified by the PATH environment variable. Also check the file mode to verify the file is executable.
which suggests that we should be detecting that zstd is not present?
Thanks for the debug logs @marcellodesales. https://github.com/github/codeql-action/pull/2710 should fix this issue. I'll let you know once this is available in a stable release — this should be ready by the end of the week.
@marcellodesales The fix is now released as part of v3.28.3. I've asked in the other thread whether you would be able to verify the fix by updating to the latest version of the CodeQL Action.
@DSmithVA Thanks again for bringing this to our attention. CodeQL Bundle v2.20.4 will ship with a reduced bundle size.
I'll close this issue.
@henrymercer Thank you for providing it... I will verify this week and report back! I did create a PR in the runner project to get the base image with zstd for faster execution https://github.com/actions/runner/pull/3670
@henrymercer We still don't have the base runner with zstd as discussed before... https://github.com/actions/runner/pull/3670 Without it, we will keep getting a very slow bootstrap of CodeQL...
$ docker run -ti remote-ghcr.docker.artifactory.viasat.com/actions/actions-runner:2.328.0 zstd
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: exec: "zstd": executable file not found in $PATH: unknown
Run 'docker run --help' for more information
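Until the base image ships zstd, one possible workaround is to install it in a step that runs before the CodeQL init action. The sketch below is an assumption-laden illustration rather than a recommended fix: ensure_tool is a hypothetical helper, and the install command in the usage comment assumes a Debian/Ubuntu-based runner image where sudo and apt-get are available.

```shell
# Run the given install command only when the tool in $1 is missing.
# ensure_tool is a hypothetical helper; everything after the tool name
# is executed as the install command.
ensure_tool() {
  tool=$1
  shift
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool already present"
    return 0
  fi
  echo "$tool missing, installing"
  "$@"
}

# Example usage in a workflow step (assumes apt-get and sudo exist):
#   ensure_tool zstd sh -c 'sudo apt-get update && sudo apt-get install -y zstd'
```

Installing at job time adds its own delay on every run, so baking zstd into a custom runner image is likely the better long-term option.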