scorecard icon indicating copy to clipboard operation
scorecard copied to clipboard

Using Scorecard GitHub Action reduces the Token-Permissions score

Open sethmlarson opened this issue 2 years ago • 20 comments

Describe the bug

urllib3 recently added the Scorecard GitHub Action to its repository. After adding this we observed our "Token-Permissions" score being 9/10 instead of the previous value of 10/10. This was due to:

Warn: jobLevel 'security-events' permission set to 'write': .github/workflows/scorecards.yml:16: update your workflow using https://app.stepsecurity.io/secureworkflow/syndicate-lang/syndicate-js/scorecards.yml/main?enable=permissions

This is because the action wants the permission security-events: write which is a permission that is docked points if given to any GitHub Action.

Reproduction steps Steps to reproduce the behavior:

  1. Add the Scorecard GitHub Action w/ security-events: write
  2. Run Scorecard on the repository
  3. Observe an imperfect score due to Scorecard GitHub Action config

Expected behavior A clear and concise description of what you expected to happen.

Using the Scorecard GitHub Action shouldn't reduce the project's scorecard score.

sethmlarson avatar Aug 16 '22 17:08 sethmlarson

@spencerschrock fyi

azeemshaikh38 avatar Aug 16 '22 17:08 azeemshaikh38

@sethmlarson slightly off-topic, but I am curious why you got a remediation link for syndicate-lang/syndicate-js? That seems like a different project, which does not have GitHub Actions workflows. For context, I am the maintainer for https://github.com/step-security/secure-workflows (which hosts APIs for app.stepsecurity.io), so want to understand if this is due to a bug...Thanks!

varunsh-coder avatar Aug 16 '22 17:08 varunsh-coder

@varunsh-coder Great question! I copied this output from this page: https://deps.dev/pypi/urllib3 to save me a second in running scorecard to get the output.

sethmlarson avatar Aug 16 '22 17:08 sethmlarson

Thats interesting. @azeemshaikh38, @laurentsimon any idea how deps.dev gets the remediation URLs? There seems to be a bug there.

varunsh-coder avatar Aug 16 '22 18:08 varunsh-coder

They are just taking the information from the scorecard result using the BQtable. Let's verify what in the BQTable for this project.

laurentsimon avatar Aug 16 '22 18:08 laurentsimon

Great find! So running the check locally seems to work fine, but our cron job table spits out the wrong result (which then propagates to deps.dev). Looks like this might be an interesting memory corruption bug - i.e we are persisting memory across runs of Scorecard which we shouldn't be doing.

@spencerschrock could you look into debugging this?

azeemshaikh38 avatar Aug 16 '22 19:08 azeemshaikh38

I highly suspect this line of code here: https://github.com/ossf/scorecard/blob/main/remediation/remediations.go#L47

azeemshaikh38 avatar Aug 16 '22 19:08 azeemshaikh38

Wrapping up tests for the commit for the original issue. can take a look at the cron job after

spencerschrock avatar Aug 16 '22 19:08 spencerschrock

I think there is a bigger problem here we need to address (maybe in a separate issue) - if we can so easily break the results shown on deps.dev, it'll be hard to build any kind of trust on Scorecard results. We need to tighten how these results get published. @brianrussell2 fyi

azeemshaikh38 avatar Aug 16 '22 19:08 azeemshaikh38

Re-opening. Can use this issue to track the cron job bug too.

azeemshaikh38 avatar Aug 16 '22 21:08 azeemshaikh38

I highly suspect this line of code here: https://github.com/ossf/scorecard/blob/main/remediation/remediations.go#L47

👍

The problem is removing the once.Do code would fix the issue with "stale" repo/branch names, but it introduces data races which cause the unit tests to fail. The issue is the raw results are generated in a different package (raw) than where the remediation text is generated (evaluation).

If we want to support concurrency (especially for something large like the cron jobs), I think the answer is dependency injection. That would just require changes to function signatures / raw data structs to accommodate it and I haven't found the cleanest way of doing it.

spencerschrock avatar Aug 16 '22 23:08 spencerschrock

Just a thought: remediation should belong in checks/evaluation instead of being a separate package (also maybe don't expose the remediation related fns)?

If every check generates its own remediation, it makes sense that it should just get generated on a per check basis inside checks/evaluation. @laurentsimon who may have more insights.

azeemshaikh38 avatar Aug 16 '22 23:08 azeemshaikh38

Just a thought: remediation should belong in checks/evaluation instead of being a separate package (also maybe don't expose the remediation related fns)?

no strong opinion from me here. I think the long-term plan was to take evaluation/ out of the checks/ in the future, which is why the remediation was out. But either works for me.

If every check generates its own remediation, it makes sense that it should just get generated on a per check basis inside checks/evaluation. @laurentsimon who may have more insights.

Maybe a simple fix is to remove the Once. I think I added this to avoid having the remediation function returns an error (and to have to check for the error for all calls), but if this Once is creating too many problems we can just remove it. We don't really need the caching since the repo client does it already. Would this help?

laurentsimon avatar Aug 16 '22 23:08 laurentsimon

Just a thought: remediation should belong in checks/evaluation instead of being a separate package (also maybe don't expose the remediation related fns)? If every check generates its own remediation, it makes sense that it should just get generated on a per check basis inside checks/evaluation. @laurentsimon who may have more insights.

I was still thinking there would need to be some sort of change here to bridge the gap between the checks and checks/evaluation packages. The checker.CheckRequest has info about the source of the request through RepoClient, which can provide repo and branch names for the Token-Permissions remediation text.

Maybe a simple fix is to remove the Once

I believe this leads to the data race problems I was talking about above. All checks are spun up as separate goroutines as part of pkg.runEnabledChecks, and multiple checks (Token-Permissions and Pinned-Dependencies) could be setting the globals in the remediation package when calling Setup. This causes go test or make unit-test to fail

WARNING: DATA RACE
Write at 0x000001f1b530 by goroutine 62:
  github.com/ossf/scorecard/v4/remediation.Setup()

spencerschrock avatar Aug 16 '22 23:08 spencerschrock

I was still thinking there would need to be some sort of change here to bridge the gap between the checks and checks/evaluation packages. The checker.CheckRequest has info about the source of the request through RepoClient, which can provide repo and branch names for the Token-Permissions remediation text.

Making this change is not urgent, IFAICT. It's re-factoring, does not need to be in this PR.

I believe this leads to the data race problems I was talking about above. All checks are spun up as separate goroutines as part of pkg.runEnabledChecks, and multiple checks (Token-Permissions and Pinned-Dependencies) could be setting the globals in the remediation package when calling Setup. This causes go test or make unit-test to fail

I meant remove the global as well if the info can be retrieved via repo client directly.

laurentsimon avatar Aug 16 '22 23:08 laurentsimon

I was still thinking there would need to be some sort of change here to bridge the gap between the checks and checks/evaluation packages.

Not sure why this is needed? We can simply move all the remediation logic into checks/evaluation and remove the Once?

azeemshaikh38 avatar Aug 17 '22 00:08 azeemshaikh38

It's the end of the day, so I might just be missing something obvious, but my thought is:

Inside checks/evaluation where the logic for the remediation check already is here, there is no way to access any variable (such as the RepoClient) that has access to the repo URI/branch.

spencerschrock avatar Aug 17 '22 01:08 spencerschrock

You're correct, the repoClient is the reason why it's not under evaluation. The reasoning is that evaluation should have all the information required to apply a policy without further API calls, as would be the case for someone fetching the results from the REST API / BQ table.

If some information is missing, maybe it should be added in the raw results.

I think we have hijacked the original intent of this issue, though :-)

laurentsimon avatar Aug 17 '22 16:08 laurentsimon

Inside checks/evaluation where the logic for the remediation check already is here, there is no way to access any variable (such as the RepoClient) that has access to the repo URI/branch.

I see, that makes sense. Like @laurentsimon mentioned, I think the right way to solve this is to have info like URL, branch_name etc. in RawData. But I know it's not an easy change and might take time. So, let's remove the init() fn and all the global vars for now (that should solve the data race issue too). We need to fix this before EOW so that the cron job picks up the fix.

Once the fix is in, let's reconsider how to solve it the right way using RawData. Wdyt?

azeemshaikh38 avatar Aug 17 '22 16:08 azeemshaikh38

I'm fine with that.

laurentsimon avatar Aug 17 '22 16:08 laurentsimon

The remediation data made its way into raw data in #3632

spencerschrock avatar Nov 21 '23 01:11 spencerschrock