[globset] supporting paths with Windows separators

Open sunshowers opened this issue 3 years ago • 6 comments

Describe your feature request

Hi Andrew! Thanks for all your wonderful crates.

I noticed that globset currently seems to have certain Unix assumptions baked in, such as literal_separator only excluding / and not \.

I'm wondering if it would be possible to run it in a mode where it supports Windows paths, without having the path converted from \ to /. The general approach I'm thinking is:

  • globs are still provided in Unix format with / path separators.
  • literal_separator is turned on.
  • a new config match_backslash_separator is provided (people can turn it on under #[cfg(windows)], or maybe it could take an enum with values like Never, OnWindows, Always).
  • internally, globset converts / in the glob to [/\\] in the regex, and makes it so that, with literal_separator on, wildcards match neither / nor \.

I might have missed something but I believe this should work. (It could also run in a "canonical Windows path" mode, where / in the glob only matches \ in the string, but I'm not sure if there's a good use case for that.)
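
To make the approach outlined above concrete, here is a minimal sketch of the glob-to-regex step. It is not globset's actual implementation: the `MatchBackslashSeparator` enum and the `glob_to_regex` function are hypothetical names, the translator only handles `*`, `?`, and literal characters, and it assumes literal_separator is always on. It uses only the standard library and produces the regex as a string rather than compiling it.

```rust
/// Hypothetical setting mirroring the Never / OnWindows / Always idea.
/// Not part of globset's API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum MatchBackslashSeparator {
    Never,
    OnWindows,
    Always,
}

impl MatchBackslashSeparator {
    /// Whether `\` should be treated as a path separator on this platform.
    fn enabled(self) -> bool {
        match self {
            MatchBackslashSeparator::Never => false,
            MatchBackslashSeparator::OnWindows => cfg!(windows),
            MatchBackslashSeparator::Always => true,
        }
    }
}

/// Translate a simplified glob (`*`, `?`, literals only) into an anchored
/// regex string. Wildcards never match a separator, which models
/// literal_separator being turned on.
fn glob_to_regex(glob: &str, mode: MatchBackslashSeparator) -> String {
    // Regex fragments for "a separator" and "any non-separator character".
    let (sep, non_sep) = if mode.enabled() {
        (r"[/\\]", r"[^/\\]")
    } else {
        ("/", "[^/]")
    };
    let mut re = String::from("^");
    for c in glob.chars() {
        match c {
            '/' => re.push_str(sep),
            '*' => {
                re.push_str(non_sep);
                re.push('*');
            }
            '?' => re.push_str(non_sep),
            // Escape characters that are special in regex syntax.
            c if r"\.+()|[]{}^$".contains(c) => {
                re.push('\\');
                re.push(c);
            }
            c => re.push(c),
        }
    }
    re.push('$');
    re
}

fn main() {
    // prints: ^src[/\\][^/\\]*\.rs$
    println!("{}", glob_to_regex("src/*.rs", MatchBackslashSeparator::Always));
    // prints: ^src/[^/]*\.rs$
    println!("{}", glob_to_regex("src/*.rs", MatchBackslashSeparator::Never));
    // Result depends on the platform the code is compiled for.
    println!("{}", glob_to_regex("src/*.rs", MatchBackslashSeparator::OnWindows));
}
```

With `Always`, the glob stays in Unix format but the resulting pattern would accept both `src/main.rs` and `src\main.rs`, while `*` still cannot cross either separator.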

sunshowers avatar Sep 23 '21 18:09 sunshowers

I think there might be a ticket for this already.

In general, I'm not opposed to this, but I don't know when it's going to happen. The globbing/gitignore code is pretty hairy and one of the more common sources of ripgrep bugs. So even just reviewing changes to it is difficult.

BurntSushi avatar Sep 23 '21 18:09 BurntSushi

Thanks! (I tried looking for a ticket but I couldn't find one -- totally might have missed it.)

I'm happy to give it a go if that works for you -- I have some experience in this area from working on source control.

sunshowers avatar Sep 23 '21 18:09 sunshowers

@sunshowers Seeing as how this hasn't landed yet, how are y'all using globs for Windows targets? Converting everything to / first?

milesj avatar Feb 23 '22 07:02 milesj

Yes, I've come to believe that using / as the canonical relative path separator, and \ as the canonical absolute path separator, makes the most sense on Windows.
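
As a rough illustration of the "convert to / first" workaround for relative paths, a minimal sketch might look like the following. The helper name `to_forward_slashes` is hypothetical, and the sketch assumes the input is a relative path coming from a Windows-style source; on Unix, `\` is a legal file name character, so applying this blindly there could change which file a path refers to.

```rust
/// Hypothetical helper: rewrite `\` to `/` in a relative path so it can be
/// matched against globs written with Unix separators.
fn to_forward_slashes(relative_path: &str) -> String {
    relative_path.replace('\\', "/")
}

fn main() {
    let normalized = to_forward_slashes(r"Sample\Dir0\File.meta");
    // prints: Sample/Dir0/File.meta
    println!("{normalized}");
}
```

The normalized string can then be handed to whatever glob matcher is in use, with the globs themselves kept in Unix format.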

sunshowers avatar Feb 23 '22 07:02 sunshowers

I support adding a way to canonicalize file paths, because rg on Windows currently produces mixed-separator paths in some cases:

# rg guid Sample/Dir0
Sample/Dir0\Dir1\File.meta
2:guid: 550e8400e29b41d4a716446655440000

@BurntSushi maybe there should be an option for choosing the preferred path separator in the output? For example OS native (default), Unix style, or Windows style (not sure about having the latter).

Habetdin avatar Aug 26 '23 11:08 Habetdin

@Habetdin I'm not sure if this ticket is really what you want here. This ticket is about a globset (a library used by ripgrep) feature.

Otherwise, ripgrep is printing a mixed path because you're using it on a system whose native path separator is \ but you've provided a file path using /. You can set the path separator that ripgrep uses with the --path-separator flag.

If you have a follow up, please don't pollute this issue. Please ask a question instead.

BurntSushi avatar Aug 26 '23 12:08 BurntSushi