
Support having multiple FFI crates

Open Manishearth opened this issue 3 years ago • 5 comments

I believe that right now cbindgen can basically be used once per project to generate a single header file, with no dependencies.

I'm working on ICU4X, which is a modular project, and we'd like the FFI story to be modular as well. It would be nice if individual crates could have their own FFI layers that can be selectively pulled in, without running into duplicate definitions when you include headers from both a crate and its dependency.

For example, I might have the langid and pluralrules crate, with pluralrules depending on langid. It would be nice if langid and pluralrules could maintain their own FFI layers, with pluralrules building on that of langid, and if a project needs pluralrules it can pull in both header files. But if a project needs langid and a different crate, it can also do that.

Manishearth avatar Feb 04 '21 06:02 Manishearth

With existing cbindgen I solve this problem by slicing the crate into parts using #[cfg]s; e.g. my crate root looks roughly like this:

#[cfg(cbindgen_low_level_data_structures)]
mod low_level_data_structures;
#[cfg(cbindgen_low_level_data_processor)]
mod low_level_data_processor;
#[cfg(cbindgen_high_level_data_structures)]
mod high_level_data_structures;
#[cfg(cbindgen_high_level_data_processor)]
mod high_level_data_processor;

and with four runs of cbindgen with four different configs I generate four headers, one for each module. (cbindgen doesn't need the crate to compile to produce bindings, so a partial crate is OK.)

It's mildly annoying because

  • configs don't support inheritance and mostly duplicate each other, with the exception of the cfg features and includes (e.g. high_level_data_structures.h must include low_level_data_structures.h),
  • cbindgen produces a lot of warnings because the cfg features are (intentionally) not mentioned in the [defines] section of the toml config,
  • I have to add a couple of dummy "root" functions to type-only headers so cbindgen considers the types defined in e.g. mod low_level_data_structures used and produces C versions for them,

but the whole setup is more or less usable.
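The third point above (dummy "root" functions) can be sketched as follows. This is a minimal illustration, assuming a types-only module: the names LowLevelPoint and low_level_data_structures_root are hypothetical, and the function exists only so that the type is referenced from an exported item.

```rust
/// A plain-data type that should end up in the types-only header.
/// (Hypothetical example type, not taken from the thread.)
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LowLevelPoint {
    pub x: i32,
    pub y: i32,
}

/// Dummy "root" function (hypothetical name). It does nothing useful;
/// it only mentions `LowLevelPoint` in an exported signature so that the
/// type counts as used when bindings are generated for this module.
#[no_mangle]
pub extern "C" fn low_level_data_structures_root(p: LowLevelPoint) -> LowLevelPoint {
    p
}
```

The function simply returns its argument, so it adds no behaviour beyond making the type reachable.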

petrochenkov avatar Feb 04 '21 10:02 petrochenkov

So, what would the ideal API for this look like? Would a way to parse, but not emit, items in a specific module / crate be enough here?

emilio avatar Feb 04 '21 15:02 emilio

So, what would the ideal API for this look like? Would a way to parse, but not emit, items in a specific module / crate be enough here?

Yeah! It's fine if you have to say cbindgen .... --dep foo/. Another way to do this would be to have cbindgen generate a crate info JSON file that lists all of the types "already generated"; you could then point cbindgen to these files, so that it doesn't need to run again on dependencies.
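To make the "crate info" idea concrete, here is a rough sketch of the bookkeeping it implies. Nothing here is cbindgen API: CrateInfo and plan_header are hypothetical names, the file format is left out entirely, and types are represented as plain strings for brevity.

```rust
use std::collections::BTreeSet;

/// Hypothetical record of what an earlier run already emitted for a
/// dependency (what a "crate info" file might contain once parsed).
pub struct CrateInfo {
    /// Header that the dependency's bindings were written to.
    pub header: String,
    /// Names of the types already defined in that header.
    pub emitted_types: BTreeSet<String>,
}

/// Decide which parsed types still need to be emitted, and which
/// dependency headers must be included instead.
/// Returns `(types_to_emit, headers_to_include)`.
pub fn plan_header(parsed_types: &[&str], deps: &[CrateInfo]) -> (Vec<String>, Vec<String>) {
    let mut to_emit = Vec::new();
    let mut includes = BTreeSet::new();
    for ty in parsed_types {
        match deps.iter().find(|d| d.emitted_types.contains(*ty)) {
            // Already defined by a dependency: include its header, emit nothing.
            Some(dep) => {
                includes.insert(dep.header.clone());
            }
            // Not seen before: this run is responsible for emitting it.
            None => to_emit.push(ty.to_string()),
        }
    }
    (to_emit, includes.into_iter().collect())
}
```

With a langid record listing one type, a pluralrules run would emit only its own types and include the langid header, which is what avoids the duplicate definitions.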

Manishearth avatar Feb 04 '21 16:02 Manishearth

Maybe we could add a "/// cbindgen header: xyz.h" attribute for modules. When called in a "--multiple" mode, cbindgen would generate a header file for each module with this annotation, containing only the items publicly defined or re-exported in that module. The annotated module itself doesn't need to be public, so it can be created purely for cbindgen use.

nacaclanga avatar Sep 23 '21 09:09 nacaclanga

So, what would the ideal API for this look like?

Going along with @Manishearth's example, I think having a single top-level FFI crate with something like pub use langid::ffi::* and pub use pluralrules::ffi::* would be a really clean way to solve this. The bindings would only ever need to be generated based on public exports from the top-level crate, and this still gives full control over selective re-exporting, all defined in familiar Rust syntax.
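A single-file sketch of that layout, for illustration only. The two modules stand in for separate crates, and every type and function name (LangId, PluralCategory, langid_new, pluralrules_select) is hypothetical, not the real ICU4X API. This shows the shape of the proposal; it does not claim that cbindgen handles such re-exports today.

```rust
/// Stand-in for the `langid` crate and its own FFI layer.
pub mod langid {
    pub mod ffi {
        /// Hypothetical two-letter language code.
        #[repr(C)]
        #[derive(Debug, Clone, Copy, PartialEq)]
        pub struct LangId {
            pub code: [u8; 2],
        }

        #[no_mangle]
        pub extern "C" fn langid_new(a: u8, b: u8) -> LangId {
            LangId { code: [a, b] }
        }
    }
}

/// Stand-in for the `pluralrules` crate; its FFI layer builds on langid's.
pub mod pluralrules {
    pub mod ffi {
        use crate::langid::ffi::LangId;

        #[repr(C)]
        #[derive(Debug, Clone, Copy, PartialEq)]
        pub enum PluralCategory {
            One,
            Other,
        }

        /// Toy rule for illustration: English "one" only for n == 1.
        #[no_mangle]
        pub extern "C" fn pluralrules_select(lang: LangId, n: u32) -> PluralCategory {
            if &lang.code == b"en" && n == 1 {
                PluralCategory::One
            } else {
                PluralCategory::Other
            }
        }
    }
}

// Top-level FFI crate: the public re-exports define the whole C surface.
pub use langid::ffi::*;
pub use pluralrules::ffi::*;
```

Dropping the second pub use line would leave a top-level crate that exposes only the langid layer, which is the selective re-exporting mentioned above.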

related: https://github.com/eqrion/cbindgen/issues/7, https://github.com/eqrion/cbindgen/issues/682

antonok-edm avatar Feb 22 '22 18:02 antonok-edm