rust-bindgen icon indicating copy to clipboard operation
rust-bindgen copied to clipboard

bindgen header, link, etc functions take String instead of Path(Buf)

Open spease opened this issue 7 years ago • 7 comments

Probably pretty self-explanatory.

header() and link() (and other functions) on bindgen take Into<String>, even though the examples show them being used with paths (although not Path with a capital P).

I'd think this would be backwards compatible, since PathBuf is supposed to contain a superset of the characters as String.

spease avatar Apr 03 '18 01:04 spease

As an anecdote, I tried to build and generate bindings for a library using cc and bindgen. I've saved the shared root folder of the library as a PathBuf. cc uses AsRef<Path> for it's methods, such as cc::Build::file so a PathBuf works fine, but bindgen uses Into<String>, which means I need to call path.into_os_string().into_string().unwrap() or similar.

As a question for the package maintainers: does using AsRef<Path> make sense for bindgen?

tangmi avatar Jan 17 '19 00:01 tangmi

Yes, I think it does, and I think I'd take a PR to update this for the arguments that take a path.

emilio avatar Jan 17 '19 00:01 emilio

I'm happy to take a shot at this.

Do you think the methods should just take a impl AsRef<Path> and immediately convert them to String (and fail if not convertable) or if the backing data structures should store PathBufs and then convert later, as needed? It seems like at some point we need them in some string-ified form to pass to clang.

Edit: Here is a very non-invasive and naive implementation possibility: https://github.com/tangmi/rust-bindgen/commit/c835325b6ac83bdcad8524800e141d5374460e42

tangmi avatar Jan 17 '19 04:01 tangmi

Hmm, yeah, that's a good point... So ideally we'd turn the clang arguments into a bunch of Box<[u8]> or Vec<u8> (so we don't need to turn it into a string) and then pass it to clang, which really wants just C strings (see TranslationUnti::parse in src/clang.rs). Though that may need clang-sys changes as well.

emilio avatar Jan 17 '19 14:01 emilio

I'll take a peek at it later. Some notes I've made:

  • Path can be converted to std::ffi::OsStr losslessly
  • There's the std::ffi::CStr type, which we could use instead of Vec<u8> to represent C strings, though I'm not sure yet how that relates to OsStr or to TranslationUnite::parse

tangmi avatar Jan 17 '19 17:01 tangmi

@emilio I'm struggling to find a solution that doesn't do a lossy conversion between string types and am looking for advice.

The main issue(s) I am running into are:

  • Path and PathBuf are represented in a platform-specific way (UTF-8 on Unix and WTF-8 on Windows).
  • Command::arg[s] takes AsRef<OsStr>, which poses no problem, as we can convert Builder::command_line_flags, Builder::dump_preprocessed_input, and Builder::rustfmt_generated_string to build a Vec<OsString> instead of a Vec<String>.
    • I looked at cc::Build, which basically wraps a Command that calls a compiler and builds arguments lists with AsRef<OsStr>.
  • bindgen uses clang-sys, which takes a *const c_char (or CString).
  • CStr/CString work with Str/String on all platforms, but not nessecarily with OsStr/OsString (there exists std::os::unix::ffi::OsStrExt::as_bytes on Unix only).

We need data as:

  1. &[u8] for writing to files
  2. AsRef<Path> for use in Command::arg
  3. *const c_char/CStr for use in clang_sys

From this, it seems that on Unix, we can use AsRef<Path> from user input and convert to the proper string types. On Windows, however, there has to be some lossy conversion (or error propagation) at some point, as we can't guarantee that conversions for 1. and 3. can be done safely.

I'd like to represent Builder and BindgenOptions with PathBuf for the relevant fields, as it seems more semantic than String (or alternatives), but before I go too far down that path, I'd like to know your thoughts on this.

Thanks!

tangmi avatar Jan 21 '19 20:01 tangmi

I see three paths here:

  1. Handle windows via the lossy path: OsString -> String -> CString and return an error when trying to set a clang_arg with an OsString that cannot be converted in this way.
  2. Change the unix methods only so they take AsRef<OsStr> instead and let windows keep using AsRef<str> and Into<String>.
  3. Leave things as they are right now.

As I'm not a windows developer I wonder what happens if you pass a windows wide string as an argument to clang.

pvdrz avatar Sep 14 '22 22:09 pvdrz