yamllint icon indicating copy to clipboard operation
yamllint copied to clipboard

handle symlinks in a semi dry manner

Open james-powis opened this issue 1 year ago • 3 comments

The Goal of this PR is to incorporate a fairly clean way to handle symlinks.

Symlinks are exceptionally problematic, because:

  • symlink may or may not match existing matching strategies
  • symlink target may or may not match existing matching stragegies
  • the symlink may reference an already matched yaml file
  • there may be many symlinks which all target the same file

This solution aims to:

  • ensure the symlink matches yaml file patterns
  • ensure the symlink is not excluded
  • ensure the symlink target matches yaml file patterns (This is one of those problematic things)
  • ensure the symlink target is not excluded
  • by using set(find_files_recursively(...)) we ensure the results from find_files_recursively is not duplicated.

This solution falls flat to be a good solution because:

  • it does not solve for handling absolute symlink paths
  • wrapping a function using yield with set() is memory inefficient.
    • Ultimately I cannot think of a solution which can be memory efficient without being duplicative when encountering symlinks.

My hope with this MR is to start a discussion about how best to handle symlinks. And my current recommendation is to skip them entirely. Leave it up to the user to target the FILES they want linted specifically.

james-powis avatar Jan 18 '24 00:01 james-powis

Coverage Status

coverage: 99.797% (-0.03%) from 99.822% when pulling ce64fc24e5c0cc68598ddb17941db9cc73b2302c on james-powis:symlink-dry into 3cb3a2038569370a2e44bcda28a378dac96c6c10 on adrienverge:master.

coveralls avatar Jan 18 '24 00:01 coveralls

This pull request is not reviewable as is: function-local code style not respected, quotes, inclusion of yamllint/.cli.py.swp (congrats for using Vim by the way :wink:), remains of your tests like maxDiff = None, independant changes like set() that should have their own commit / review, etc.

But I understand that it's intended, and you want to start a discussion about symbolic links. To be honest I have a hard time understanding your commit message (e.g. the "strategies"), and what "this solution" consists in. You should describe it.

And my current recommendation is to skip them entirely.

I don't think all yamllint users would agree, this would be a breaking change.

adrienverge avatar Jan 20 '24 08:01 adrienverge

The goal of this PR was not to actually get this accepted, it was to demonstrate one of the few problems symlinks present. Especially around excluding... Do you apply exclude the symlink or the target, or both, none? How do you handle the reporting of the lint issue, do you report the link, or the target?. How do you prevent 100 million symlinks to the same file not result in the running of yamllint 100 million times.

Ultimately, if the file should be linted, it should be linted at the source not the one of potentially numerous links to it. By excluding the symlink entirely, the users of yamllint have the option to target the target via include if so desired file. But that is just my opinion.

james-powis avatar Jan 21 '24 03:01 james-powis