🌱 Remediation cleanup
What kind of change does this PR introduce?
Followup PR to clean up some work done in #2168
- [X] PR title follows the guidelines defined in our pull request documentation
What is the current behavior?
The function signatures for TokenPermissions and PinningDependencies was different from the rest of the checks because it included the check request. This wasn't ideal as it should ideally operate just on raw data.
More remediation specific types/functions have been moved to the remediation package.
What is the new behavior (if this is a feature change)?**
All the remediation metadata is now part of the raw data structures for the checks that need it. The function signatures are now restored.
- [X] Tests for the changes have been added (for bug fixes/features)
Which issue(s) this PR fixes
Fixes #2152
Special notes for your reviewer
Does this PR introduce a user-facing change?
For user-facing changes, please add a concise, human-readable release note to
the release-note
(In particular, describe what changes users might need to make in their application as a result of this pull request.)
NONE
Codecov Report
Merging #2354 (20d40ae) into main (7f214bf) will increase coverage by
0.23%. The diff coverage is68.25%.
Additional details and impacted files
@@ Coverage Diff @@
## main #2354 +/- ##
==========================================
+ Coverage 40.55% 40.79% +0.23%
==========================================
Files 112 112
Lines 8822 8827 +5
==========================================
+ Hits 3578 3601 +23
+ Misses 4984 4967 -17
+ Partials 260 259 -1
Integration tests success for [20d40ae52f8e014819a9776777b88acec2e46a1b] (https://github.com/ossf/scorecard/actions/runs/3251877629)
I'm not sure remediation logic can live in raw since raw it non opinionated?
As part of structured results, I think it makes sense to have a dedicated remediation file. The plan is to have a config files for each "rule" to be able to scale, so the remediation code should be simple.
Wdut?
Do we think we should wait till we start the implementation of the structured results to update this code?
I'm not sure remediation logic can live in
rawsince raw it non opinionated?
I meant the Remediation metadata struct (which just has branch and repo name). That should be ok to live in raw ?
As part of structured results, I think it makes sense to have a dedicated remediation file. The plan is to have a config files for each "rule" to be able to scale, so the remediation code should be simple.
IMO each check should own its logic (including raw, remediation, rules etc.). Its easier to scale the number of checks that way.
Do we think we should wait till we start the implementation of the structured results to update this code?
Yes and no. IIUC, this PR is trying to address the fact that some files in raw package depend on CheckRequest instead of RepoClient only (it currently doesn't accomplish that and we should definitely fix that in this PR). But we can avoid adding more packages to the project until we have reached a consensus on the structured results design.
I'm not sure remediation logic can live in
rawsince raw it non opinionated?I meant the
Remediationmetadata struct (which just has branch and repo name). That should be ok to live inraw?
IMO, remediation has nothing to do with raw. raw is non-opinionated and has no concept of alerts / bugs, so should not be aware of remediations. I don't mind keeping it in raw temporarily, though. But long-term I don't think it should live in raw.
As part of structured results, I think it makes sense to have a dedicated remediation file. The plan is to have a config files for each "rule" to be able to scale, so the remediation code should be simple.
IMO each check should own its logic (including raw, remediation, rules etc.). Its easier to scale the number of checks that way.
With structured results, there will be no remediation-specific code to write in the checks. The checks will simply instantiate a remediation / rule object with the name of the rules. The remediation package will have the common logic which will be independent of the rules. The rules will be defined in a config file: we can decide to put them under the directory of each check (if we think this helps scale / maintenance) or in a rule directory shared across all checks.
Do we think we should wait till we start the implementation of the structured results to update this code?
Yes and no. IIUC, this PR is trying to address the fact that some files in
rawpackage depend onCheckRequestinstead ofRepoClientonly (it currently doesn't accomplish that and we should definitely fix that in this PR). But we can avoid adding more packages to the project until we have reached a consensus on the structured results design.
Some high-level comments:
->
RawResultshould be data only. So let's not add any non-serializable logic to that structure.
I assume you're talking about the SetupRemediationMetadata method defined on the token and pinning structs?
Yes and no. IIUC, this PR is trying to address the fact that some files in raw package depend on CheckRequest instead of RepoClient only (it currently doesn't accomplish that and we should definitely fix that in this PR). But we can avoid adding more packages to the project until we have reached a consensus on the structured results design.
Just to be clear remediations is already a package in the codebase. I'm. not opposed to each check owning its remediation logic, but I'm also not familiar with the code Laurent is talking about.
IMO, remediation has nothing to do with raw. raw is non-opinionated and has no concept of alerts / bugs, so should not be aware of remediations. I don't mind keeping it in raw temporarily, though. But long-term I don't think it should live in raw.
It's a bit of a misnomer currently, as it's just information about the repository that's needed to generate remediation text. If we're trying to make it so evaluation code only depends on raw data, but evaluation is where the remediation text gets generated, then that metadata about the project needs to live in the raw data somewhere.
@azeemshaikh38 I chatted with @laurentsimon. What do you think of adding the remediation.Metadata as an argument to the evaluation function signatures. md would be optional (could be nil), and just wouldn't generate remediations if not present. As a reminder, that type looks like:
type MetadataKey string
const (
BranchName MetadataKey = "branch"
RepoName MetadataKey = "repo"
)
type Metadata map[MetadataKey]string
The signature for evaluation.TokenPermissions would then look like
func TokenPermissions(name string, dl checker.DetailLogger,
r *checker.TokenPermissionsData, md remediation.Metadata) checker.CheckResult
Generating the remediation.Metadata map could go a few places:
- Inside pkg.RunScorecards() if we just want to generate the data once to be used by every check.
- Inside the various
checks/checkname.gofiles
Not a fan of (1) as it would require additional arguments added to multiple functions to bubble down. And the logic would be centralized, and pretty far from where its used.
I think (2) works well, as each check would own the code responsible for maintaining its own remediation metadata. It would look something like this for checks/permissions.go:
// TokenPermissions will run the Token-Permissions check.
func TokenPermissions(c *checker.CheckRequest) checker.CheckResult {
rawData, err := raw.TokenPermissions(c)
if err != nil {
e := sce.WithMessage(sce.ErrScorecardInternal, err.Error())
return checker.CreateRuntimeErrorResult(CheckTokenPermissions, e)
}
// Return raw results.
if c.RawResults != nil {
c.RawResults.TokenPermissionsResults = rawData
}
md := extractRemediationMD(c)
// Return the score evaluation.
return evaluation.TokenPermissions(CheckTokenPermissions, c.Dlogger, &rawData, md)
}
Stale pull request message