Bringing `OsStr` and `CStr` up to par with `str`
Redirected from https://github.com/rust-lang/rust/issues/22741
Using `starts_with()` as an example:
Right now one can do this on `OsStr` by:
- requiring the `OsStr` to be UTF-8 (and converting it to `str`/`String`)
- or implementing a Unix version using `as_bytes()` and a Windows version using `encode_wide()`
Unfortunately, the lack of a common method for iterating over the elements of an `OsStr` means that any out-of-std implementation of these will be unhappily platform-specific.
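To make the second option concrete, here is a rough sketch of what such a platform-specific helper tends to look like. The function name `os_str_starts_with` is made up for illustration and is not part of std; the sketch only covers Unix and Windows targets and assumes nothing beyond the existing `as_bytes()` and `encode_wide()` extension methods.

```rust
use std::ffi::OsStr;

/// Hypothetical helper (not part of std): on Unix an `OsStr` is just bytes,
/// so the prefix test is a plain byte-slice comparison.
#[cfg(unix)]
pub fn os_str_starts_with(s: &OsStr, prefix: &OsStr) -> bool {
    use std::os::unix::ffi::OsStrExt;
    s.as_bytes().starts_with(prefix.as_bytes())
}

/// Hypothetical helper (not part of std): on Windows, compare the
/// 16-bit code units yielded by `encode_wide()` one at a time.
#[cfg(windows)]
pub fn os_str_starts_with(s: &OsStr, prefix: &OsStr) -> bool {
    use std::os::windows::ffi::OsStrExt;
    let mut units = s.encode_wide();
    // Every unit of the prefix must equal the next unit of `s`;
    // if `s` runs out first, `next()` is `None` and the test fails.
    prefix.encode_wide().all(|p| units.next() == Some(p))
}
```

Two separate bodies are needed for a single logical operation, which is exactly the duplication complained about above.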
It's not clear to me why we can't expose the underlying representation with some sort of `.as_bytes()`. Is it just because we expect people to assume that on Windows the bytes are in the active code page rather than WTF-8?
cc @eddyb
#1876 discusses some ideas for how slicing could work for CStr.
See also:
- https://github.com/rust-lang/rust/pull/26499, an unmerged implementation of `starts_with` and `ends_with` and discussion of generalizing the `Pattern` API to `OsStr`.
- https://github.com/rust-lang/rfcs/pull/1309, a promising RFC involving the `OsStr` API.
- https://github.com/rust-lang/rust/issues/40300, discussion of prefix comparison for `OsStr`.
For everyone still waiting on this issue, I created OsStr Bytes, which should make OsStr and OsString much easier to use.
@dylni I realize you're intentionally invoking UB in that crate, but it's still UB and I therefore wouldn't recommend that approach: https://github.com/dylni/os_str_bytes/blob/ff8a9ed5e7d50b9ff63ea20bfd460a7f481340c1/src/windows.rs#L16-L28
bstr does as much as is possible with as little work as possible without breaking abstraction boundaries or invoking UB: https://docs.rs/bstr/0.2.8/bstr/#file-paths-and-os-strings
@BurntSushi I completely agree and was hoping to hear some opinions on this, which is why I put it on the front page of the documentation. My plan is to come up with something better before a 1.0 release, but I haven't yet decided what the best option would be. However, I don't agree with the approach of not being able to handle some valid arguments on Windows, which is why I created the crate.
I may just end up copying next_code_point() from the standard library, but I haven't decided yet. The problem with this is that bytes could be mangled if the standard library implementation changes, which is why I've avoided it so far.
@BurntSushi That method is no longer called.
About starts_with, how about
impl OsStr {
fn starts_with(&self, prefix: &OsStr) -> bool
}
?
This way I won't have to handle the `Option`, as in `osstr.to_str().map(|x| x.starts_with("prefix")).unwrap_or(false)`, and could instead write `osstr.starts_with(OsStr::new("prefix"))`.
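Until something like this lands in std, the method-call ergonomics can be approximated outside std with an extension trait. The following is only a sketch: the trait name `OsStrPrefix` and the method name `starts_with_os` are hypothetical (a distinct name is used so it cannot clash with a future inherent `starts_with`), and it only covers Unix and Windows targets. The `starts_with_via_str` function shows the `to_str()` workaround for comparison.

```rust
use std::ffi::OsStr;

/// Hypothetical extension trait; the names here are illustrative only.
pub trait OsStrPrefix {
    fn starts_with_os(&self, prefix: &OsStr) -> bool;
}

impl OsStrPrefix for OsStr {
    #[cfg(unix)]
    fn starts_with_os(&self, prefix: &OsStr) -> bool {
        use std::os::unix::ffi::OsStrExt;
        // Unix: compare the raw bytes directly.
        self.as_bytes().starts_with(prefix.as_bytes())
    }

    #[cfg(windows)]
    fn starts_with_os(&self, prefix: &OsStr) -> bool {
        use std::os::windows::ffi::OsStrExt;
        // Windows: compare 16-bit code units pairwise.
        let mut units = self.encode_wide();
        prefix.encode_wide().all(|p| units.next() == Some(p))
    }
}

/// The workaround from the comment above: it goes through `Option`
/// and reports `false` whenever the `OsStr` is not valid UTF-8.
pub fn starts_with_via_str(s: &OsStr, prefix: &str) -> bool {
    s.to_str().map(|x| x.starts_with(prefix)).unwrap_or(false)
}
```

With the trait in scope, a call site reads `name.starts_with_os(OsStr::new("prefix"))` and, unlike the `to_str()` route, does not give up on names that are not valid UTF-8.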