pipe-rename
pipe-rename copied to clipboard
Store each path as a Path instead of a String (which is utf8-only)
Comment on: https://github.com/marcusbuffett/pipe-rename/blob/5658581f1f3b62ef086a18931d41999c09ee93ed/src/main.rs#L168-L175
I'm thinking, the filter_map exists only because the conversion from an OsString to a String could fail if the path is not UTF-8 and we might get an Err(...) instead of an Ok(String).
Edit: the filter_map also exists because each entry in from read_dir can be an error, my bad.
My observation is still valid though and is on: .into_string().ok() which will return None for a non-utf8 path, and discard it.
I don't think that's right, the tool shouldn't skip non-utf8 paths.
@marcusbuffett I'm wondering if we could change something in the entire codebase:
--> Instead of storing each path in a String (utf8), we could simply keep it as a Path (which can contain non-utf8).
For the occasional cases where we need to print a path (like for the diff), we could use a replacement char for the non-utf8 chars. With a note for the user at the end of the diff if at least one path is non-utf8, the detection / change could be done before the diff to not over complicate it.
NOTE: I'd be interested to take a stab at this if you're ok with this
@bew i was thinking along the same lines when I updated clap, since you can parse arguments as PathBuf instead of string, but then I ended up having to unwrap stuff to match everything else. I agree we should be working with PathBufs when possible, would love for you to take a stab at it
Or OsString or whatever, not totally sure, I leave it up to you
That OsString documentation scared me. I'm curious how vim deals with the issue.
On Fri, Apr 30, 2021, 5:56 PM Marcus Buffett @.***> wrote:
Or OsString or whatever, not totally sure, I leave it up to you
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/marcusbuffett/pipe-rename/issues/30#issuecomment-830465185, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAE66RMWEZAGOLQFRU3QTJLTLM7TLANCNFSM435NM3AQ .
Just thinking out loud: wouldn't a change like this:
- cause issues across the Windows/non-Windows boundary because of
OsStringon Windows being 16-bit code units and 8-bit on other systems?OsStringtoStringand back should be lossless for Windows. For other systems this would only be guaranteed for UTF(-8)-based locales, though; i.e.$LANG. - move the onus of dealing with the encoding to the invoked command/editor (which may be hard to convey in a generic fashion for anything that's not UTF-8)?