gitlint icon indicating copy to clipboard operation
gitlint copied to clipboard

Integration tests failing in CI because of missing ANSI color codes

Open jorisroovers opened this issue 3 years ago • 1 comments

Our CI builds broke on the integration tests for the gitlint commit message hook. This happened without any change in the code as far as I can tell. The build actually broke a while back, I just hadn’t noticed since I’m in a low activity period wrt gitlint maintenance.

My best guess is that this is because of a github CI environment change, because all the tests run fine locally and they were working fine a while back.

Specifically, several tests are failing on missing ANSI color characters in the gitlint hook output. Our integration tests check for these ANSI color characters, but in the CI run the ANSI color codes are no longer there. It seems all the failing tests have the same root cause (missing ANSI color codes).

Example failing tests: test_commit_hook_no_violations. Here’s where we check for the ANSI codes in the test, those are missing from the test output in CI.

https://github.com/jorisroovers/gitlint/blob/63500a584473565451064ac751024691c40d4b95/qa/test_hooks.py#L66

gitlint uses click to do the coloring, which uses colorama under-the-hood. See here:

https://github.com/jorisroovers/gitlint/blob/63500a584473565451064ac751024691c40d4b95/gitlint-core/gitlint/cli.py#L381-L386

This is the only place where gitlint uses terminal colors.

The only reason I can think that those colors being disabled is because click/colorama is not detecting a session terminal (= no tty connected to stdin).

This is probably caused in how we invoke gitlint from the integration tests, using the sh library:

https://github.com/jorisroovers/gitlint/blob/63500a584473565451064ac751024691c40d4b95/qa/shell.py#L8-L11

Side-note: I’ve done most of the legwork to remove the need for sh in the integration tests (a dependency I ultimately want to remove), but the hook tests are the last ones that depend on it, because of some stdin/stdout convenience functions provided by sh that I haven’t gotten around to eliminating.

What I don’t understand though, is that this has now become a problem, while it was working fine before.

I did find https://github.com/actions/runner/issues/241, which might be related, but again, that doesn’t explain why tests worked for a long time.

I’ve tried a few things in #297, but have not been able to solve this 😒 stdin/tty problems 😱

jorisroovers avatar Jun 07 '22 21:06 jorisroovers

Ok, after trying a bunch of different things, I was able to fix this by dynamically patching the commit-msg during CI runs in github, see #304. This is not ideal, but it’s the least intrusive IMO, as I really didn’t want to change gitlint code itself to account for issues that are very specific to the Github CI environment and nowhere else.

There’s a lot of things going on in this issue that I don’t fully understand, but for future reference, here’s what I do know.


The issue

Because no TTY is attached to STDIN in gh CI, click is not outputting color codes. This is generally desired behavior (e.g. when piping commands), but we want to be able to test color output functionality during our integration test runs. We do this by forcing a TTY to be attached to STDIN during testing. This worked fine before, but recently broke in github CI and tests started failing because our tests expect ANSI color codes, and those are no longer outputted by gitlint run-hook. Integration tests work fine locally.


The root cause

I confirmed that STDIN detection is working in the code, and that the issue is that our integration tests can no longer attach a TTY to STDIN in the GH CI runner. Why this only now is a problem, and not much earlier as the related https://github.com/actions/runner/issues/241 has existed since 2019, I don’t know. Something must have changed in addition more recently.

  • It think it might be possible to determine when this started breaking by downloading and running different versions of the action runner locally (https://github.com/actions/runner), but I honestly don’t want to go through all that additional hassle (especially since it might still be dead end). I’ve found a fix that works now, that’s the most important thing.

The fix

By wrapping the gitlint invocation in a script command, you can attach a (fake?) TTY and that will fix the issue.

  • I first found out about this here: https://github.com/gfx/example-github-actions-with-tty

  • Just setting shell: 'script -q -e -c "bash {0}"' in github CI doesn’t however fix our particular problem, because in the particular failing tests gitlint is being invoked from within the commit-msg script, which in itself is invoked by git.

  • The actual fix involves wrapping the invocation of gitlint inside the commit-msg with script -q -e -c.

    # Normally we invoke gitlint inside `commit-msg` like so:
    gitlint --staged --msg-filename "$1" run-hook
    
    # To make output colorization work in gh CI, we need to invoke gitlint like so inside `commit-msg`:
    script -e -q -c  "gitlint --staged --msg-filename \"$1\" run-hook"
    
  • I have used sed to make this change dynamically during CI runs prior to invoking the integration tests.

  • I also had to make a few minor integration test changes, but I wouldn’t consider those workarounds for this particular issue, more brittle assert statements that this particular issue brought out.


sh library was not the problem

This entire issue has nothing to do with the use of sh in the integration tests. As part of this fix I did explore removing sh all-together but found its still a bit more involved than I thought it was. Punting this to the future.


Click and colorization

jorisroovers avatar Jul 14 '22 10:07 jorisroovers

Ok, after searching through Click issues (again), I found https://github.com/pallets/click/issues/2207 and a bunch of duplicates which removes the need for me to open a new issue. As part of those issues, I discovered that Click does in fact have an undocumented color option that you can use like so:

if __name__ == "__main__":
    color = os.environ.get("NO_COLOR")
    color = color in {"1", "true"} if color is not None else None
    cli(color=color)

Alternatively, I read it's also available as part of ctx.color, which might be more interesting for gitlint as that would allow for something like this (pseudo code):

@click.option('--color', envvar='GITLINT_COLOR',...)
def cli(ctx, color ...):
   # ...
   ctx.color = True if color == "always" else False

Good enough for now - closing out this issue.

jorisroovers avatar Aug 12 '22 07:08 jorisroovers

@jorisroovers please also make sure to support the https://no-color.org standard along with the FORCE_COLOR env var.

webknjaz avatar Nov 21 '22 17:11 webknjaz

@webknjaz Ack, agree.

jorisroovers avatar Nov 22 '22 10:11 jorisroovers

Something has changed on the GHA side again, making the workaround implemented in #304 no longer necessary (in fact, the workaround breaks the tests now). I've merged #387 which removes the workaround. I'm glad this is fixed and the hacky workaround is no longer needed, but I don't understand what exactly changed. The fact that it happens more or less without warning isn't great either.

I don't want to go down a rabbit hole again at this point though to figure out more details - it's just not the best use of my time. The important part is that this is fixed. Hopefully no more back-and-forth on this :-)

jorisroovers avatar Dec 09 '22 11:12 jorisroovers