cargo-deny

Exhaustive license searching

Open • iliana opened this issue 5 years ago • 1 comment

Is your feature request related to a problem? Please describe.

In Bottlerocket, we use cargo-deny for enforcing a license policy, as well as bottlerocket-license-scan to identify license files in vendored sources to copy into a final OS image.

bottlerocket-license-scan grew a clarification feature very much like deny.toml to handle situations where a license file doesn't scan as anything SPDX knows about (within reasonable confidence), or where a license is scanned that isn't part of the crate's license string. I believe this logic is similar to something we saw in cargo-deny v0.2, but maybe I'm misremembering.

We've seen a decent number of -sys crates that vendor code and don't reflect the vendored license in the license field. Some examples are backtrace-sys, zstd-sys, and, now that I'm writing a bottlerocket-license-scan clarify.toml for cargo-deny itself, libgit2-sys, which vendors libgit2 and transitively vendors some of libgit2's dependencies. It fortunately all looks permissive, but there's still a good amount of work for software archaeologists to pick apart.

Describe the solution you'd like

I'd just like to clarify (heh) whether you intend to keep your documented approach or adjust it:

"Note however, that cargo-deny does not (currently) exhaustively search the entirety of the source code of every crate to find every possible license that could be attributed to the crate, as there are a ton of edge cases to that approach."

Like you say, there are a ton of edge cases, and those edges are very sharp and pointy.

I'd like to actually go to all these upstreams and help them reflect their total licenses properly, and I think cargo-deny can help with that, but it might need to grow exhaustive license searching again on an opt-in basis to assist with that.
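To make "exhaustive" concrete, here is a minimal sketch of the first step such a search would need: walking an unpacked crate's entire source tree and collecting every file whose name looks like a license, including ones buried in vendored C/C++ directories. This is not how cargo-deny or bottlerocket-license-scan is implemented; the function names and the prefix list are assumptions made up for illustration, and a real scanner would still have to identify each file's contents against SPDX texts.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// File-name prefixes treated as license-like. This list is an assumption
/// for illustration, not the set used by any existing tool.
const LICENSE_PREFIXES: [&str; 4] = ["LICENSE", "LICENCE", "COPYING", "NOTICE"];

/// Returns true when the file name starts with one of the prefixes,
/// ignoring ASCII case (so `license.txt` and `COPYING.LESSER` both match).
fn looks_like_license(file_name: &str) -> bool {
    let upper = file_name.to_ascii_uppercase();
    LICENSE_PREFIXES.iter().any(|prefix| upper.starts_with(prefix))
}

/// Collects every license-like file under `root`, descending into all
/// subdirectories. Symlinks are skipped, which avoids following cycles.
fn find_license_files(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];

    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // `DirEntry::file_type` does not follow symlinks.
            let file_type = entry.file_type()?;
            let path = entry.path();

            if file_type.is_dir() {
                pending.push(path);
            } else if file_type.is_file() {
                if let Some(name) = path.file_name().and_then(|n| n.to_str()) {
                    if looks_like_license(name) {
                        found.push(path);
                    }
                }
            }
        }
    }

    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}
```

Pointing `find_license_files` at an unpacked -sys crate would list the crate's own license file alongside any found in its vendored sources; comparing that list with the crate's declared license string is where the sharp edge cases begin.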

iliana, Mar 17 '20 19:03

My current plan for this is #121. clearlydefined.io could act as a supplement to local license checking, and it allows anyone to submit curations to a central source of truth that (hopefully) eventually reach the actual upstream repo. But yes, we've noticed the exact same thing ourselves: basically every crate that links C/C++ code seems to completely ignore the license requirements of that code.

Jake-Shadle, Mar 17 '20 19:03