wildmatch
wildmatch copied to clipboard
Fix ** matches to explore multiple possible matches
fixes #14
Previously the doublestar token would match greedily as much as it could, so for example, the double star token of the pattern **/*/file
, when applied to a directory string a/b/c/file
, would greedily match all of a/b/c/file
and not leave enough for the rest of the pattern (*/file
).
The solution implemented here changes the token interface so that the Consume method returns a slice of paths([]string) that the token could match. These results are added to a stack and iterated over until a solution is found. So in the above example, instead of just returning an empty slice of strings (because the **
consumed everything), it would return the following slice:
[
[""]
["file"]
["c" "file"]
["b" "c" "file"]
]
So now, the rest of the pattern (file
) can match the second item returned.
Thanks for the feedback. I definitely could have missed something and it might not work in some cases.
Actually, this problem actually has two components; the first is determining whether or not a match string matches a path name. This part seems to work as intended. I tried to cover the test cases but maybe there is a test case that could break it though.
The second problem is that if it doesn't match, then it should return the remaining directory paths. This is actually what that longest path is for. As we are going through all of the match possibilities, it keeps a record of the longest path it has seen. This was less clear as none of the tests cover what this []string value should be. Maybe another test could be made which tests this []string from the consume method.
This part seems to work as intended. I tried to cover the test cases but maybe there is a test case that could break it though.
That's good to hear. I am still fearful that there is a particularly bad case that neither of us are thinking of. @git-lfs/core: are there any such cases that you can think of?
The second problem is that if it doesn't match, then it should return the remaining directory paths.
I think that the current implementation of consume()
provides us with this information without the requirement we modify it. For instance, consume()
returns a ([]string, bool)
indicating (1) which directory components were matched and (2) whether or not there is any left.
If there are directory components left over, I think that we could do an array-wise prefix difference from initial directory components and the ones that was matched. Anything in the suffix of the original directory structure (e.g., the parts not in the dirs
slice) is what is left over.