openmc icon indicating copy to clipboard operation
openmc copied to clipboard

Adding tmate action to CI for debugging

Open pshriwise opened this issue 1 year ago • 3 comments

Description

This adds tmate to our CI in a mode that will only generate a connection if the CI action has failed. The connection will remain open for 15 minutes to allow someone with access to the action output to connect to the CI machine (once connected, the action will remain live until the user exits the terminal session).

Alternative options for deploying tmate are discussed in the issue below.

Fixes #3137

pshriwise avatar Sep 20 '24 15:09 pshriwise

This could be helpful to investigate the PR #3087

ahnaf-tahmid-chowdhury avatar Sep 30 '24 05:09 ahnaf-tahmid-chowdhury

Okay I think I found a solution I'm happy enough with here, even if it's a little less elegant than I'd like. Contributors can now submit a commit with a message that contains [gha-debug] to their PR to produce a detached tmate session that allows all steps of the workflow to execute and then provides an SSH command to allow one to login and perform whatever problem solving is needed (default timeout is 10 min).

Github actions surprisingly doesn't make it very easy to get the latest commit information for all events that trigger workflows as part of the github.event context in the workflow file. The pull_request event requires a checkout of the repo with more depth so the merge commit parents can be extracted to get the correct message for determining whether or not the tmate session should be enabled.

pshriwise avatar Oct 11 '24 03:10 pshriwise

Just smoothed out one final snag here: The tmate action was causing jobs to fail when timing out, but I wasn't a fan of that. The continue-on-error flag for that step did the trick, so CI runs that trigger the tmate debug session with [gha-debug] in their commit messages should pass checks if the tests pass. This should keep PRs from requiring another CI session if the tests pass with the tmate session enabled.

Here's an example of this working in my fork: https://github.com/pshriwise/openmc/actions/runs/11335356157

I'll note that the errors do still appear in the summary, but aren't really of consequence.

pshriwise avatar Oct 14 '24 23:10 pshriwise

As is, this will require a PR author to be aware of the "[gha-debug]" special string and make a commit with that. I personally don't see how this is much easier than letting people be aware of tmate in general and having them add the few lines in ci.yml since either way they will have to make an extra commit. I would prefer the latter for simplicity.

Hrmm that's a good point about awareness. It's something of an easter egg at the moment. I think there are ways to address that though by adding notes in the documentation testing section and to the pull request template.

I'd love for more people to be aware of the tmate action for sure since it's so useful, and, yes, it's true that a commit is required either way, but what you've outlined above requires a change to ci.yml that will have to be removed later before the PR is merged (if the author and reviewer remember). It just feels more noisy to me. I think will also save contributors time determining:

  • where the ci.yml file is,
  • where in the ci.yml file they should place the tmate lines,
  • and what options for the tmate action they should use

as well as maintainers' time providing that explanation if needed instead of "run $ git commit -m "[gha-debug]" --allow-empty and watch CI for a login line".

At the end of the day, It's up to you bit I wouldn't mind saving some cycles for CI debugging when it's needed.

pshriwise avatar Nov 08 '24 23:11 pshriwise

I would agree with @pshriwise , as a user/dev, I feel that adding an empty commit, is simpler than tweaking the ci.yml (one way to get access to the runner, and back to remove it at the end)

bam241 avatar Jan 15 '25 19:01 bam241

Ok, you've won me over on this. Before I approve/merge though, can you add something brief in the documentation (probably devguide/tests.rst) that explains how to use this?

🙏🏻

Gladly!

pshriwise avatar Jan 24 '25 17:01 pshriwise