Bringing `OsStr` and `CStr` up to par with `str`

Open codyps opened this issue 10 years ago • 8 comments

Redirected from https://github.com/rust-lang/rust/issues/22741

Using starts_with() as an example: right now one can do this on OsStr by:

  • requiring the OsStr to be UTF-8 (and converting it to str/String), or
  • implementing a Unix version using as_bytes() and a Windows version using encode_wide().

Unfortunately, the lack of a common method for iterating over the elements of an OsStr means that any out-of-std implementation of these methods is unavoidably platform-specific.
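
To make the platform split concrete, here is a minimal sketch of the second route. The helper name os_starts_with is hypothetical and not a std API; the Unix branch assumes std::os::unix::ffi::OsStrExt::as_bytes and the Windows branch assumes std::os::windows::ffi::OsStrExt::encode_wide.

```rust
use std::ffi::OsStr;

/// Hypothetical helper (not part of std): does `s` begin with `prefix`?
#[cfg(unix)]
pub fn os_starts_with(s: &OsStr, prefix: &OsStr) -> bool {
    use std::os::unix::ffi::OsStrExt;
    // On Unix an OsStr is an arbitrary byte sequence, so compare raw bytes.
    s.as_bytes().starts_with(prefix.as_bytes())
}

/// Hypothetical helper (not part of std): does `s` begin with `prefix`?
#[cfg(windows)]
pub fn os_starts_with(s: &OsStr, prefix: &OsStr) -> bool {
    use std::os::windows::ffi::OsStrExt;
    // On Windows, compare the (possibly ill-formed) UTF-16 code units.
    let mut units = s.encode_wide();
    // Every unit of `prefix` must equal the next unit of `s`; if `s`
    // runs out first, `next()` yields None and the comparison fails.
    prefix.encode_wide().all(|p| units.next() == Some(p))
}
```

The two bodies share nothing but the signature, which is exactly the duplication an out-of-std implementation has to carry.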

codyps avatar Feb 24 '15 05:02 codyps

It's not clear to me why we can't expose the underlying representation with some sort of .as_bytes(). Is it just because we assume people are gonna assume that on Windows that means it's the active codepage and not WTF-8?

cc @eddyb

ben0x539 avatar Apr 23 '16 17:04 ben0x539

#1876 discusses some ideas for how slicing could work for CStr.

dtolnay avatar Nov 17 '17 02:11 dtolnay

See also:

  • https://github.com/rust-lang/rust/pull/26499, an unmerged implementation of starts_with and ends_with and discussion of generalizing the Pattern API to OsStr.
  • https://github.com/rust-lang/rfcs/pull/1309, a promising RFC involving the OsStr API.
  • https://github.com/rust-lang/rust/issues/40300, discussion of prefix comparison for OsStr.

dtolnay avatar Nov 17 '17 02:11 dtolnay

For everyone still waiting on this issue, I created OsStr Bytes, which should make OsStr and OsString much easier to use.

dylni avatar Nov 29 '19 18:11 dylni

@dylni I realize you're intentionally invoking UB in that crate, but it's still UB and I therefore wouldn't recommend that approach: https://github.com/dylni/os_str_bytes/blob/ff8a9ed5e7d50b9ff63ea20bfd460a7f481340c1/src/windows.rs#L16-L28

bstr does as much as is possible with as little work as possible without breaking abstraction boundaries or invoking UB: https://docs.rs/bstr/0.2.8/bstr/#file-paths-and-os-strings

BurntSushi avatar Dec 04 '19 13:12 BurntSushi

@BurntSushi I completely agree and was hoping to hear some opinions on this, which is why I put it on the front page of the documentation. My plan is to come up with something better before a 1.0 release, but I haven't yet decided what the best option would be. However, I don't agree with the approach of not being able to handle some valid arguments on Windows, which is why I created the crate.

I may just end up copying next_code_point() from the standard library, but I haven't decided yet. The problem with this is that bytes could be mangled if the standard library implementation changes, which is why I've avoided it so far.

dylni avatar Dec 04 '19 14:12 dylni

@BurntSushi That method is no longer called.

dylni avatar Dec 07 '19 16:12 dylni

About starts_with, how about

impl OsStr {
   fn starts_with(&self, prefix: &OsStr) -> bool
}

?

This way I wouldn't have to handle the Option, as in osstr.to_str().map(|x| x.starts_with("prefix")).unwrap_or(false), and could instead write osstr.starts_with("prefix".as_ref()).
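
In the meantime, the proposed signature can be approximated out of tree with an extension trait. This is only a sketch under stated assumptions: it needs Rust 1.74 or later for OsStr::as_encoded_bytes, and the names OsStrPrefixExt, starts_with_os and starts_with_via_str are hypothetical. Because std leaves the byte encoding unspecified, a byte-wise comparison is an approximation; on Windows, for instance, a prefix ending in an unpaired surrogate may not match a string in which that surrogate is half of a complete pair.

```rust
use std::ffi::OsStr;

/// Hypothetical extension trait; not part of std.
pub trait OsStrPrefixExt {
    /// Returns true if `self` begins with `prefix`, compared byte-wise.
    fn starts_with_os(&self, prefix: &OsStr) -> bool;
}

impl OsStrPrefixExt for OsStr {
    fn starts_with_os(&self, prefix: &OsStr) -> bool {
        // Assumption: comparing the platform-specific encoded bytes is
        // good enough for prefix checks (requires Rust 1.74+).
        self.as_encoded_bytes()
            .starts_with(prefix.as_encoded_bytes())
    }
}

/// The Option-based workaround, kept for comparison. It returns false
/// for any OsStr that is not valid UTF-8, even if the prefix matches.
pub fn starts_with_via_str(s: &OsStr, prefix: &str) -> bool {
    s.to_str().map(|x| x.starts_with(prefix)).unwrap_or(false)
}
```

With the trait in scope, the call site becomes osstr.starts_with_os("prefix".as_ref()), with no Option to unwrap.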

mzr avatar Jun 23 '22 13:06 mzr