exclusion patterns are sensitive to current working directory
Hello!
I hope this is formatted acceptably, I've tried to provide context, observed & expected behaviours, possible solutions and a workaround I can use but would like to avoid.
Please let me know if I must provide any further information. If you like any of my proposed approaches I think I would be able to put together a pull request if that helps :)
Background Context
I'm currently writing a script that uses lizard.py as a library, in which I call lizard.analyze() on several directories with differing parameters. In my script I allow the user to specify exclude_patterns for each directory to be analysed by lizard.
The script is invoked from a parent directory, meaning it does not run lizard.analyze() with the paths equal to ["."], but rather ["./some/path/here"]. This means filenames lizard generates while walking the target directories are prefixed with their relative path to my script, e.g.: "./some/path/here/then/a/specific/file.txt".
This is fine, and everything works, with one slight problem: the exclude_pattern parameters to lizard are matched against the file paths lizard generates while walking, including the prefixes.
This means that, unlike the examples in the README which show concise exclude_pattern patterns relative to the analysis directory, my users must specify patterns that are aware of their target directory relative to their current working directory. E.g.:
What would have been: ./tests/*
Now becomes: ./repos/specific-repo/tests/*
I'm wondering if this is intended behaviour, should exclude_patterns be matched against the relative path from the analysis directory? Or should it remain how it is: matching against the relative path from the current working directory?
Observed Behaviour:
exclude_patterns match against the file's path relative to the current working directory
Expected Behaviour:
exclude_patterns match against the file's path relative to the directory being analysed
Proposed Changes
Change exclude_patterns matching behaviour?
In this approach, the code in get_all_source_files() is changed to apply the exclude_patterns to file paths relative to the directory being analysed.
all_listed_files()(lizard.py:893) changed to yield a tuple of (pathand path to file relative topath)_validate_file()(lizard.py:884) changed to apply exclude patterns against the relative part (second element of tuple from previous point)get_all_source_files()has a finalmapcall added that flattens the tuple into a full path again, e.g.:os.path.join(tuple[0], tuple[1])
Pros
- No changes to the public API
Cons
- Changes the current behaviour of the library, which may disrupt current users
- Fixes my specific use-case, but doesn't account for the many ways somebody might want to filter files
Allow user to provide custom function for use in _validate_file()?
In this approach, the public API of the library is adjusted to allow the caller to pass a callable/function/lambda that is invoked in _validate_file() (lizard.py:884). This would allow a caller to provide completely arbitrary filtering logic without the library caring how it's done.
In my case, this would allow me to manipulate the path to be relative to the directory I want it to be relative to before applying exclusion patterns myself.
Pros
- Does not change the current library behaviour (current users unaffected)
- Allows any user to customise how they want to filter files using arbitrary code
Cons
- Change to public API, but shouldn't actually affect anybody (would add a new parameter for passing a callable that defaults to None, or a dummy filter that always returns True, but anyone not currently passing this will get the default behaviour that is consistent with current behaviour)
Workarounds & Alternatives
I can work around this by making my script prefix the user's exclude_patterns with the relative path from the current directory to the analysis directory, but it's a leaky abstraction that I'm concerned may surprise users - so I'm curious to know if lizard is supposed to behave this way?
PS, in case it's useful for additional context, I've now made the script I'm working on available here.
Within that file: the analyse_repo() function calls into lizard on this line in particular. Notice that before I can pass in exclude_patterns, I must first patch them using the patch_relative_exclude_patterns() function here.
Hi Oliver,
Thanks for your proposal. Just landed at Tokyo after 7h of flight. Let me get to the details a bit later.
On 26 Aug 2018, at 7:33 AM, Oliver Maskery [email protected] wrote:
PS, in case it's useful for additional context, I've now made the script I'm working on available here https://github.com/omaskery/lizard-monitor.
Within that file: the analyse_repo() function https://github.com/omaskery/lizard-monitor/blob/master/lizard-mon.py#L74 calls into lizard on this line in particular https://github.com/omaskery/lizard-monitor/blob/master/lizard-mon.py#L93. Notice that before I can pass in exclude_patterns, I must first patch them using the patch_relative_exclude_patterns() function here https://github.com/omaskery/lizard-monitor/blob/master/lizard-mon.py#L79.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/terryyin/lizard/issues/238#issuecomment-416000794, or mute the thread https://github.com/notifications/unsubscribe-auth/AAwJYtb_9M2LU0RGjj82CSj8HqXX5xV9ks5uUdDAgaJpZM4WMXS_.
Not a problem, not in a rush, hope you had a good flight :)!