pgrx icon indicating copy to clipboard operation
pgrx copied to clipboard

Prevent symbols from system headers ending up in generated bindings

Open thomcc opened this issue 2 years ago • 2 comments

(Sorry if this issue exists already, I couldn't find it)

pgx-pg-sys's generated bindings seem have a lot of symbols that come from system headers which have been included by postgres headers, rather instead of coming from postgres itself. This causes some significant perf issues, but also some correctness/safety concerns having to do with how system headers do symbol aliasing.

Unfortunately on system libraries symbols frequently[^1] get conditionally aliased in various ways, often to handle stuff like 64 vs 32 bit ino_t/off_t/time_t/etc. As an example, the various DIR functions (readdir, seekdir, etc) are exported from the generated bindings on x86_64-apple-darwin, but the actual symbols 64 bit processes should be calling on that target are somewhat more complicated. Because these get aliased in the system headers via an asm label attribute and not a define, bindgen doesn't understand it and gets the wrong symbol name.

The outcome is that we might be exposing functions with slightly wrong ABIs on some targets. While this is bad... in practice it's farily unlikely anybody is going to use pgx_pg_sys::seekdir rather than libc::seekdir (although, perhaps it could happen if editor autocomplete guesses wrong, or in a glob-import situation). So in practice it's perhaps more a compile-perf and usability issue than anything else.

The easiest way to fix this is the allowlist_file feature of bindgen which would let us only generate bindings that are from certain files. (Edit: Previously I indicated this was in an unmerged fork, but it actually has merged).

Alternatively, we could collect the symbols from the postgres dynamic libraries (perhaps when building with cargo-pgx), and use that to produce a bindgen allowlist. This is more annoying, and sadly doesn't handle type definitions (unlike allowlist_file), but OTOH I suspect we want to know what symbols these are anyway.

[^1]: There are 304 instances of link_name in rust-lang/libc's src folder, for example.

thomcc avatar Oct 03 '22 21:10 thomcc

Unfortunately I have witnessed some users doing use pgx::pg_sys::*; in their extension code, so this is in fact an active concern and not merely a hypothetical one! I believe Postgres 15 will start discouraging them from doing that (a new symbol in Postgres 15 is named String, just like our std::string::String) but not everyone's going to immediately migrate the first chance they get.

workingjubilee avatar Oct 03 '22 21:10 workingjubilee

In a way, the fact that these might be used is really an argument against my assumption that it's mostly a perf issue. Instead, it could badly broken code, as in general bindgen does badly with system headers (they're too full of obscure C compiler features). Still, you're right we should be careful about breaking changes.

I guess instead of --allowlist-file, perhaps the short term would be replacing the bindings from libc with reexports from the libc crate where possible[^1]? Longer term I suspect don't want pgx_pg_sys to provide anything besides postgres APIs and re-exports of required types from the libc crate.

[^1]: Because the libc crate is not a 100% complete binding to everything in the system headers, we can't replace every non-postgres API we currently have bindings for. Sadly, there's a big issue here, since "missing in libc" sometimes means "not possible to expose to Rust soundly" 😓, which we shouldn't patch over.

thomcc avatar Oct 04 '22 00:10 thomcc

Partially improved by #733 but this should really be reviewed several times before we close it.

workingjubilee avatar Oct 07 '22 06:10 workingjubilee

I noticed that we sometimes change the signature of functions when we wrap them. For example: https://docs.rs/pgx-pg-sys/latest/pgx_pg_sys/fn.pg_fprintf.html -- the real function should be pg_fprintf(*mut FILE, *const c_char, ...). The reason it's that it's impossible to wrap:

  1. Rust cannot write C-style variadic functions (functions that take ...) stablely.
  2. C cannot write variadic wrapper functions (that take ...) and then forward those arguments to another function with ... (it would need to use a va_list).

The reason this happens is in https://github.com/tcdi/pgx/blob/b92c60974c203e07e05cf1320581084be20a8e8a/pgx-utils/src/rewriter.rs#L150 and https://github.com/tcdi/pgx/blob/master/pgx-utils/src/rewriter.rs#L171. Essentially, if an argument is of an unknown format, we skip it in the rewriter rather than emitting any sort of error. The variadic arguments come through as Pat::Verbatim with a ... token. We should emit an error in this case.


Similarly, this impacts the #[pg_guard] macro, and means something like this:

#[pg_guard]
extern "C" {
    fn foo(_: c_int, name: *const c_char) -> c_int;
}

will generate a wrapper that silently skips the first argument.

thomcc avatar Oct 12 '22 14:10 thomcc

Regarding #760

This is a note to act as a reminder to re-visit .allowlist_* for bindgen. See that issue for more details.

BradyBonnette avatar Oct 13 '22 17:10 BradyBonnette