[Feature] Evaluate whether to extend CI with static analysis
π Feature
More specifically, tools which were suggested to us by auditors include:
- Semgrep
- Dylint
- cargo-edit
- cargo hack
- Zizmor
If we're happy with any of these, we can also extend to snarkOS
So I spent some time on checking semgrep and it is a fast static analysis tool that scans source code and configuration files using simple rule-based patterns, regexes mainly. Unlike compiler-level tools, it works purely on the structure of files, which makes it extremely quick (around 7-8 seconds for the entire SnarkVM repository in the moment, but I'll work on the specific rules that make sense for an example PR).
It is good for enforcing project-specific conventions.
It adds the following:
Additional safety layer on top of Clippy and tests. Semgrep catches patterns that the Rust compiler and Clippy donβt cover - such as forbidden unwrap(), panic!, println!, unsafe patterns, and other conventions we choose to enforce. I will add examples on how to use these in the example PR; For this project, as we use panic!, todo!, etc I will add just an idea so we can, for example remove them in the future.
It doesnβt just analyze Rust code: Semgrep also inspects YAML, TOML, Markdown, JSON, and shell scripts, which allows us to detect issues in CI configs, Cargo metadata, documentation, and helper scripts.
Semgrep reliably flags leftover <<<<<<< HEAD / >>>>>>> branch markers in all supported file types, preventing accidental commits of broken files. We had commits with problems like that before. In source files the CI breaks but in some markdown or config files it may pass and break runtime (for the JSON, toml, yaml).
It allows custom repository-specific rules, I've been checking good practice-rules for rust and will add for yml, shell scripts, , json and markdown files. I did a find . -type f -not -path "./target/*" -name '*.*' | sed -E 's/.*\.([^.]+)$/\1/' | tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr and these file types make sense for this repository.
I think it won't make the CI slower and heavier and only has pros, so will make a PR with meaningful rules for the file types I mentioned.
Dylint
Dylint is a tool for running custom Clippy-like lints without needing to modify the Rust compiler or Clippy itself. Itβs essentially a framework that lets you write, test, and run your own lint crates that behave exactly like Clippy lints β but distributed as normal Rust crates and dynamically loaded at runtime.
Clippy + dynamic lint loading + ability to write project-specific lint rules.
With the CI in mind Dylint can be used for:
- Enforcing project-specific rules (basically what semgrep could do) and making PRs fail when code violates internal rules:
- Disallowing unwrap() in production code.
- Ensuring certain APIs (tokio::spawn) are used in approved wrappers.
- Linting for misuse of unsafe code or cryptographic primitives in your project.
- Avoiding maintaining a Clippy fork : basically because it allows dynamically writing custom clippy rules.
What Dylint CAN NOT help with in CI:
- Speed up builds
- Speed up tests
- Reduce compilation
Basically this is the more "rusty" way of achieving the special rust-specific checks I did in the semgrep PR. And of course as it is rust-specific we can write more special checks.
From your comment on the semgrep PR, I think we can leave Dylint and we don't need it for the CI.
cargo-edit
Cargo edit mainly extends the cargo CLI. It adds commands like:
cargo add <crate>
cargo rm <crate>
cargo upgrade (update dependencies)
cargo set-version
It modifies the Cargo.toml through commands, that's its power. It can not speed up builds, tests, etc.
When cargo-edit is used in CI
- Automation for dependency upgrade bots : If dependencies should be updated in given time intervals, maybe with some cron CI job. I don't think this is working for SnarkVM.
- Publishing automation : It comes with
cargo set-versionand we can automate publishing the crate, but this is not for our case too. - Repo tooling : For example Talisker could use it to rewrite dependencies when compiling, the CI too can do something like that, but again far fetched.
So I really don't see much use related to the SnarkVM CI.
cargo hack
This one can be used to test multiple feature combinations, which is interesting. I've been playing with it and ran:
cargo hack check \
--workspace \
--each-feature \
--no-dev-deps \
--ignore-private \
--exclude-features "heavy,experimental" \
--exclude-no-default-features
It basically tests feature combinations. For SnarkVM it actually ends with an error
error[E0425]: cannot find function `max_available_threads` in module `snarkvm_utilities::parallel`
--> fields/src/lib.rs:82:59
|
82 | let num_cpus_available = snarkvm_utilities::parallel::max_available_threads();
| ^^^^^^^^^^^^^^^^^^^^^ not found in `snarkvm_utilities::parallel`
|
note: found an item that was configured out
--> /Users/meddle/development/rust/snarkVM/utilities/src/parallel.rs:56:8
|
56 | pub fn max_available_threads() -> usize {
| ^^^^^^^^^^^^^^^^^^^^^
note: the item is gated here
--> /Users/meddle/development/rust/snarkVM/utilities/src/parallel.rs:55:1
|
55 | #[cfg(not(feature = "serial"))]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For more information about this error, try `rustc --explain E0425`.
error: could not compile `snarkvm-fields` (lib) due to 1 previous error
error: process didn't exit successfully: `/Users/meddle/.rustup/toolchains/1.88-aarch64-apple-darwin/bin/cargo check --manifest-path /Users/meddle/development/rust/snarkVM/console/types/Cargo.toml --no-default-features --features serial` (exit status: 101)
Maybe my knowledge of the SnarkVM repo is shallow, so this is normal here.
Another usage is to run test including and excluding tests by feature:
cargo hack test --each-feature --exclude-features heavy_feature
But we do that manually anyway, so don't think it is a big improvement. Though for the above check something like that can be added to every workflow:
jobs:
....
feature-matrix:
executor: rust-docker
resource_class: << pipeline.parameters.medium >>
steps:
- checkout
- restore_cache:
keys:
- cargo-cache-{{ arch }}-{{ checksum "Cargo.lock" }}
- cargo-cache-{{ arch }}-
- run:
name: Install cargo-hack
command: |
set -euo pipefail
cargo install cargo-hack --version 0.6.3
- run:
name: Check all features per crate
command: |
set -euo pipefail
export CARGO_TERM_COLOR=never
# Fast, PR-safe feature matrix:
# * runs on all workspace crates
# * checks each feature individually
# * ignores private (publish = false) crates
# * strips dev-deps
cargo hack check \
--workspace \
--each-feature \
--no-dev-deps \
--ignore-private \
--exclude-features "heavy,experimental" \
--exclude-no-default-features
- save_cache:
key: cargo-cache-{{ arch }}-{{ checksum "Cargo.lock" }}
paths:
- /usr/local/cargo/registry
- /usr/local/cargo/git
- target
So this one maybe interesting to dig deeper?
Additional thoughts from Chat GPT on this tool:
Should you use cargo-hack? Yes if either of these is true:
A) Your libraries expose many optional features B) You want to avoid breaking feature combinations for downstream users C) You want to detect CI-only or user-only feature regressions
cargo-hack is the only reliable way to do this.
Zizmor
This is a tool for analysing github actions for possible problems with them. It doesn't seem like a thing you run in the CI, more something tat could be more useful in a commit hook.
For example when I ran it locally I got:
zizmor .
π zizmor v1.18.0
INFO audit: zizmor: π completed ./.github/workflows/benchmarks.yml
INFO audit: zizmor: π completed ./.github/workflows/sync-snarkvm-documentation.yml
INFO audit: zizmor: π completed ./.github/workflows/sync-snarkvm-to-welcome-documentation.yml
warning[artipacked]: credential persistence through GitHub Actions artifacts
--> ./.github/workflows/benchmarks.yml:17:9
|
17 | - name: Checkout
| _________^
18 | | uses: actions/checkout@v5
| |_________________________________^ does not set persist-credentials: false
|
= note: audit confidence β Low
= note: this finding has an auto-fix
error[unpinned-uses]: unpinned action reference
--> ./.github/workflows/benchmarks.yml:21:9
|
21 | uses: dtolnay/rust-toolchain@stable
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ action is not pinned to a hash (required by blanket policy)
|
= note: audit confidence β High
error[unpinned-uses]: unpinned action reference
--> ./.github/workflows/benchmarks.yml:112:9
|
112 | uses: benchmark-action/github-action-benchmark@v1
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ action is not pinned to a hash (required by blanket policy)
|
= note: audit confidence β High
warning[artipacked]: credential persistence through GitHub Actions artifacts
--> ./.github/workflows/sync-snarkvm-documentation.yml:15:9
|
15 | - uses: actions/checkout@v5
| ^^^^^^^^^^^^^^^^^^^^^^^^^ does not set persist-credentials: false
|
= note: audit confidence β Low
= note: this finding has an auto-fix
warning[excessive-permissions]: overly broad permissions
--> ./.github/workflows/sync-snarkvm-documentation.yml:12:3
|
12 | / sync:
13 | | runs-on: ubuntu-latest
14 | | steps:
15 | | - uses: actions/checkout@v5
... |
25 | | -H "Accept: application/vnd.github.v3+json" https://api.github.com/repos/${{ secrets.SYNC_ORGANIZATION }}/${{ secrets....
26 | | -d '{"event_type":"sync-snarkvm-documentation", "client_payload": { "branch": "'"${GITHUB_REF#refs/heads/}"'" }}'
| | ^
| | |
| |______________________________________________________________________________________________________________________________this job
| default permissions used due to no permissions: block
|
= note: audit confidence β Medium
warning[artipacked]: credential persistence through GitHub Actions artifacts
--> ./.github/workflows/sync-snarkvm-to-welcome-documentation.yml:15:9
|
15 | - uses: actions/checkout@v5
| ^^^^^^^^^^^^^^^^^^^^^^^^^ does not set persist-credentials: false
|
= note: audit confidence β Low
= note: this finding has an auto-fix
warning[excessive-permissions]: overly broad permissions
--> ./.github/workflows/sync-snarkvm-to-welcome-documentation.yml:12:3
|
12 | / sync:
13 | | runs-on: ubuntu-latest
14 | | steps:
15 | | - uses: actions/checkout@v5
... |
25 | | -H "Accept: application/vnd.github.v3+json" https://api.github.com/repos/${{ secrets.SYNC_ORGANIZATION }}/${{ secrets.SYNC_APPLI...
26 | | -d '{"event_type":"sync-snarkvm-to-welcome-documentation", "client_payload": { "branch": "'"${GITHUB_REF#refs/heads/}"'" }}'
| | ^
| | |
| |_________________________________________________________________________________________________________________________________________this job
| default permissions used due to no permissions: block
|
= note: audit confidence β Medium
24 findings (17 suppressed, 3 fixable): 0 informational, 0 low, 5 medium, 2 high
So it checks github actions for possible problems. A linter for github actions.
So maybe from all these tools some worth we can get from zizmor as a commit hook check and we fix its warnings and from cargo-hack to force checks on how feature combinations can break each other in the CI.
Zizmor sounds great.
Checking feature combinations sounds potentially interesting. However, figuring out for which feature combinations it makes sense requires a lot of thinking and is low priority. And maybe it is possible in a limited manual way using just 'cargo clippy'? So let's defer.
OK, so I am leaving my findings in this ticket for now and we can do PRs later if we decide to use something - maybe zizmor (we need to fix the errors from it first, or come with a smart configuration as you said).