codeql-action icon indicating copy to clipboard operation
codeql-action copied to clipboard

Move logs, SARIF, database bundle actions uploads to post: hooks

Open angelapwen opened this issue 2 years ago • 10 comments

Previously, even with debug mode on, if the init step failed we did not upload the appropriate Actions artifacts. This was because the artifacts were only uploaded in the analyze step.

This change:

  • moves the uploading of log files to a post: hook in the init step. Regardless of whether the entire workflow was successful or if any steps after init failed, we will upload whatever logfiles we have as an artifact.
  • moves the uploading of database bundles to a post: hook in the init step. If the database has not been finalized, instead of running the CLI database bundle command, the action directly zips up everything in the database directory.
  • moves the uploading of SARIF results (if they exist) to a post: hook in the analyze step.

So far the change has been manually tested so that:

  • when init fails, partial logs from init are uploaded as artifacts, and the partial database bundle is uploaded.
  • when analyze fails, logs from init and analyze are uploaded as artifacts; the partial database bundle is uploaded; and the SARIF upload succeeds if there is a SARIF file generated in the output folder of the action.
  • on workflow success, all file types are successfully uploaded.

Testing strategy:

  • [ ] unit tests for the scripts called by post: hooks
  • [x] integration tests showing what happens when init and analyze fail; will write after #1160 is merged.

Merge / deployment checklist

  • [x] Confirm this change is backwards compatible with existing workflows.
  • [N/A] Confirm the readme has been updated if necessary.
  • [x] Confirm the changelog has been updated if necessary.

angelapwen avatar Jul 29 '22 09:07 angelapwen

I'm correctly hitting the case where the database isn't finalized now, and just need to implement the zipping of the partial database files. I see documentation for the CLI command but am still not exactly sure which files I'm looking to zip. I'll take a look at the CLI source code but would appreciate any pointers!

Also, I see I have a CI failure on debug artifact which looks legitimate, as that's the code I'm changing. It only fails on ubuntu, as for some reason the step to download the artifact doesn't begin on the Mac runners. It looks like this is because the step to download the artifacts is executed before the post: hooks that upload them, so I'll have to find out how to reorder these.

I see the PR check comes from https://github.com/github/codeql-action/blob/b100b75d58470d253f593ae0e7205913f569450a/pr-checks/checks/debug-artifacts.yml#L16 which executes directly after the analyze step. Not entirely sure how to make sure it executes only after the post: hooks as I guess the entire point is that the post: hooks happen after everything else. 🤔

angelapwen avatar Jul 29 '22 10:07 angelapwen

I think perhaps splitting the PR check file into 2 jobs, one for running everything before checking artifact downloads, and one for just checking artifact downloads with a needs dependency on the former, might force things into the right order. Similar to the code blocks in the docs at https://docs.github.com/en/actions/learn-github-actions/essential-features-of-github-actions. I'll try to do this in a separate PR and see if it's at least a no-op without these new post: hook changes.

angelapwen avatar Jul 29 '22 13:07 angelapwen

I've made most of the requested changes, and will work on the refactoring for unit tests.

I also made the change to zip up the database bundle and it seems to be working as expected on a failed run.

The core.info logs state:

db-javascript is not finalized. Uploading partial database bundle at /home/runner/work/_temp/codeql_databases/db-javascript-partial.zip...

and then the zip file is uploaded as part of the artifacts. I can't attach it here due to file size limits but here is the directory structure of the partial database bundle file when unzipped: Screen Shot 2022-08-01 at 12 26 23 PM

angelapwen avatar Aug 01 '22 10:08 angelapwen

I added //TODOs in locations where I am planning to unit test, with the description of what each unit test should do, to get feedback on the planned tests before/while I write them.

angelapwen avatar Aug 01 '22 11:08 angelapwen

Quickly closing and re-opening to trigger new PR checks from actions/toolkit#1160

angelapwen avatar Aug 08 '22 11:08 angelapwen

I am working on integration testing with a new PR check (Debug artifact upload after failure) and have a few thoughts:

Would appreciate thoughts!

angelapwen avatar Aug 08 '22 14:08 angelapwen

with the init step, I was previously only simulating failure by throwing an exception in the code itself. I tried to add with: ram: 1 to the step but this seemed to be enough memory to pass the init step.

Let's not worry about this for now. We could try to convince init to fail by giving it malformed input arguments or config file, but I think init itself failing is an extreme case. We can focus for now on handling failures that occur after init.

with the analyze step, I successfully fail the step mid-way with the ram: 1 test, and the "Download and check" job successfully passes after I pass the continue-on-error flag to the "Upload" job (otherwise it won't even run the Download job). But this still causes a red X in the PR checks which is likely very confusing to anyone else opening a pull request. The https://github.com/actions/runner/issues/2347, but I'm not sure if that's worthwhile because the integration test could fail for another reason and we wouldn't necessarily know.

Good observation. We don't have existing integration tests for expected failures, for exactly this reason. A solution that works with branch protection is to make the new Download and check debug artifacts check required, but the Upload debug artifacts checks optional. This ensures PRs are blocked only if the debug artifact check is actually broken. However this does not save us from the ugly red X for Upload debug artifacts.

adityasharad avatar Aug 08 '22 17:08 adityasharad

Let's not worry about this for now. We could try to convince init to fail by giving it malformed input arguments or config file, but I think init itself failing is an extreme case. We can focus for now on handling failures that occur after init.

Okay, that sounds good.

Good observation. We don't have existing integration tests for expected failures, for exactly this reason. A solution that works with branch protection is to make the new Download and check debug artifacts check required, but the Upload debug artifacts checks optional. This ensures PRs are blocked only if the debug artifact check is actually broken. However this does not save us from the ugly red X for Upload debug artifacts.

Yes, that makes sense. At least we could make sure that the download job succeeds. I could change the PR check for Upload debug artifacts to include [Failure Expected] or something.

angelapwen avatar Aug 09 '22 08:08 angelapwen

Screen Shot 2022-08-09 at 4 56 23 PM What future PR checks will see. Hopefully this gets the message across? Maybe `Failure Intended` would be even more clear 😄

angelapwen avatar Aug 09 '22 14:08 angelapwen

That looks good. The other option I considered was turning this workflow off on PRs, and only running it on push events. Let's keep this for now, and if the "failing" check confuses other contributors we can adjust it.

adityasharad avatar Aug 09 '22 16:08 adityasharad

The remaining unit tests are up, requesting final review now 😄

After this is merged, I will add the "Download and check debug artifacts after failure in analyze" PR check/job to the required PR checks for each branch. I'll also write up an issue on improving the codeql database bundle command to handle non-finalized databases.

angelapwen avatar Aug 11 '22 14:08 angelapwen