ansible-lint icon indicating copy to clipboard operation
ansible-lint copied to clipboard

fix: use ansible-compat to install collections

Open mnaser opened this issue 3 years ago • 7 comments

This patch switches ansible-lint to start using ansible-compat to install a collection if needed, this removes some duplicated code and also solves issues where both molecule and ansible-lint and running on the same system and fighting against each other (one creates symlinks, the other copies content).

Fixes https://github.com/ansible/ansible-lint/issues/2426

mnaser avatar Sep 20 '22 13:09 mnaser

@ssbarnea can I add a commit to bump the schema here or?

mnaser avatar Sep 20 '22 13:09 mnaser

I am fixing it separately before merging this fix but it is ok to include in your patch, just to enable testing.

ssbarnea avatar Sep 20 '22 14:09 ssbarnea

@ssbarnea should we drop most of those tests since they're essentially "testing" ansible-compat ?

mnaser avatar Sep 20 '22 15:09 mnaser

Nice one @mnaser.

I can confirm this passes (a slightly adjusted version of) the test I wrote to reproduce the same bug. See

  • https://github.com/ansible/ansible-lint/compare/main...heindsight:ansible-lint:cache-link-conflict and
  • https://github.com/ansible/ansible-lint/compare/main...heindsight:ansible-lint:cache-conflict-fixed

heindsight avatar Sep 20 '22 18:09 heindsight

I want to get this in but we need to pass all tests.

ssbarnea avatar Sep 21 '22 16:09 ssbarnea

ok, the reason it's pretty borked right now is that ansible-compat will check if the parent folder has a galaxy.yml and if so, install that as a collection.

however, in our case, it's an example (but invalid one) so the tests are failing.

https://github.com/ansible/ansible-compat/blob/main/src/ansible_compat/runtime.py#L354-L359

@ssbarnea what do you suggest in this case? can we rename that file? move it into its own folder? the current examples layout seems to make ansible-lint feel that it's a collection but it's galaxy.yml has invalid dependencies.

mnaser avatar Sep 21 '22 19:09 mnaser

ok, the reason it's pretty borked right now is that ansible-compat will check if the parent folder has a galaxy.yml and if so, install that as a collection.

however, in our case, it's an example (but invalid one) so the tests are failing.

https://github.com/ansible/ansible-compat/blob/main/src/ansible_compat/runtime.py#L354-L359

@ssbarnea what do you suggest in this case? can we rename that file? move it into its own folder? the current examples layout seems to make ansible-lint feel that it's a collection but it's galaxy.yml has invalid dependencies.

Hmm. Thanks point pointing this shady code. If the tests are failing because it happens to find a broken-example of a galaxy.yml file in a parent directory, just move the tests, or move the broken galaxy file (which is likely used by another test).

If you identify the fix, you can also opt-in to make a different PR, so it would be easier to review.

ssbarnea avatar Sep 21 '22 22:09 ssbarnea

@ssbarnea I'm not sure why some macOS tests are failing and why the eco jobs are failing

could you give me a hand if you have a few extra minutes?

mnaser avatar Sep 22 '22 15:09 mnaser

@mnaser is not your fault. We already spotted it few hours ago. I suspect Github updated the runners and put some surprises for us. I seen that ansible is failing syntax check in that particular case but we were not able reproduce.

I have a change to increase pytest verbosity and hopefully to find the root cause of the failure.

In general if all linux jobs are passing, your change is good (WSL is also unstable)

ssbarnea avatar Sep 22 '22 15:09 ssbarnea

@mnaser is not your fault. We already spotted it few hours ago. I suspect Github updated the runners and put some surprises for us. I seen that ansible is failing syntax check in that particular case but we were not able reproduce.

I have a change to increase pytest verbosity and hopefully to find the root cause of the failure.

In general if all linux jobs are passing, your change is good (WSL is also unstable)

cool, well in that case, the change is good to go

mnaser avatar Sep 22 '22 15:09 mnaser

@mnaser Sorry but I spotted a regression affecting one of the eco repositories, look at this job and you will something like:

 ERROR! Unexpected Exception, this is probably a bug: Cannot call rmtree on a symbolic link

This is for sure related to this change. Now, it might be possible that the bug is inside ansible-compat, but still, merging this fix, would require us to fix that bug first.

You can run the same locally if you want, tox -e eco.

ssbarnea avatar Sep 22 '22 20:09 ssbarnea

@mnaser I raised https://github.com/ansible/ansible-compat/issues/165 but I am not really sure about the solution.

Basically we need to avoid trying to install current collection using ansible-galaxy if we already symlinked it, as the tool will fail. If we do not symlink it, we might not be able to obtain good error messages while linting, as they would be originating from outside the current repository.

I hope to see more contributions from you in the future. Maybe we can talk at some point about zuul-jobs, which seems to bit slow in fixing the issues reported by the linter. Maybe we can join forces with @ianw to make it full green.

ssbarnea avatar Sep 22 '22 20:09 ssbarnea

@mnaser I fixed compat, made a new release and updated this PR to make use of it. I also checked the other two broken jobs but the errors reported on them are unrelated to these changes so I will merge it.

ssbarnea avatar Sep 23 '22 15:09 ssbarnea