Make pg_duckdb work with vanilla libduckdb.so
What happens?
I built and installed pg_duckdb, then downloaded the latest binary release, libduckdb-linux-amd64.zip, and put libduckdb.so into /usr/local/lib so that the OS would find it in the ld paths when Postgres requested it. It failed with this error:
FATAL: could not load library "/var/lib/postgresql/data/15/lib/pg_duckdb.so": /var/lib/postgresql/data/15/lib/pg_duckdb.so: undefined symbol: _ZNK6duckdb28SimpleNamedParameterFunction8ToStringB5cxx11Ev
So I replaced /usr/local/lib/libduckdb.so with the file from pg_config --pkglibdir (the OS cannot find it there, since it's not in the ld paths, so I don't understand why make install puts it there). That fixed the issue. Yay!
Except that now the other DuckDB extension I'm working with, alitrack/duckdb_fdw, doesn't work with the libduckdb.so compiled by pg_duckdb (and I do have a symlink from libduckdb.1.1.3.so so that duckdb_fdw is happy). It complains:
ERROR: could not load library "/var/lib/postgresql/data/15/lib/duckdb_fdw.so": /var/lib/postgresql/data/15/lib/duckdb_fdw.so: undefined symbol: _ZN6duckdb9Exception25ConstructMessageRecursiveERKSsRSt6vectorINS_20ExceptionFormatValueESaIS4_EE
Maddening.
Would it be possible to get pg_duckdb to work with the binary release of libduckdb.so? If not, is there some build configuration to get everything included? It seems like both symbols ought to be present, honestly, since ConstructMessageRecursive has been around for two years, and SimpleNamedParameterFunction for four years.
To Reproduce
git clone --depth 1 --branch "v0.2.0" https://github.com/duckdb/pg_duckdb
make -C pg_duckdb install
curl -LO https://github.com/duckdb/duckdb/releases/download/v1.1.3/libduckdb-linux-amd64.zip
unzip -d . libduckdb-linux-amd64.zip libduckdb.so
install -m 755 libduckdb.so /usr/local/lib/libduckdb.so
(cd /usr/local/lib && ln -s libduckdb.so libduckdb.1.1.3.so)
# Add pg_duckdb to shared_preload_libraries and start Postgres
OS:
Linux
pg_duckdb Version (if built from source use commit hash):
0.2.0
Postgres Version (if built from source use commit hash):
15.10
Hardware:
No response
Full Name:
David Wheeler
Affiliation:
Tembo
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
Not applicable - the reproduction does not require a data set
Did you include all code required to reproduce the issue?
- [x] Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?
- [x] Yes, I have
Hi @theory, thanks for reporting the issue. Out of curiosity, why are you trying to bring the DuckDB library from https://github.com/duckdb/duckdb/releases/ ?
make install in the pg_duckdb root directory should build and install it.
That doesn't work for you?
You can also take a look at our Dockerfile if it helps.
> Out of curiosity, why are you trying to bring the DuckDB library from https://github.com/duckdb/duckdb/releases/ ?
Because pg_duckdb is not the only extension that needs to use libduckdb.so. One example is alitrack/duckdb_fdw, which works great with the binary-distributed libduckdb.so, but fails with the pg_duckdb-compiled libduckdb.so with this error:
ERROR: could not load library "/var/lib/postgresql/data/15/lib/duckdb_fdw.so": /var/lib/postgresql/data/15/lib/duckdb_fdw.so: undefined symbol: _ZN6duckdb9Exception25ConstructMessageRecursiveERKSsRSt6vectorINS_20ExceptionFormatValueESaIS4_EE
Hmm. Currently we compile a slightly custom build of duckdb because of the http request caching we support. It would definitely be nice to support running on an unmodified version, but that probably requires some changes to move the http caching to pg_duckdb (or at least to detect whether we are running with an unmodified version so that we can disable the caching logic).
It seems kinda weird that our libduckdb.so does not have that _ZN6duckdb9Exception25ConstructMessageRecursiveERKSsRSt6vectorINS_20ExceptionFormatValueESaIS4_EE symbol though. Maybe we're missing a compilation flag or something.
That's my guess as well, though I didn't see anything obvious at a glance. Equally weird that _ZNK6duckdb28SimpleNamedParameterFunction8ToStringB5cxx11Ev is missing from the binary distribution. It's super odd TBH.
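For anyone who wants to dig into this, the two mangled names themselves carry a hint. The one pg_duckdb asks for contains the `B5cxx11` ABI tag, which GCC adds under the newer libstdc++ string ABI, while the one duckdb_fdw asks for uses the `Ss` substitution, which only applies to the older `std::string`. That reading is an inference from the symbol names, not something confirmed in this thread. A minimal sketch for checking it is below; `abi_hint` and `has_symbol` are hypothetical helper names, the `Ss` match is a rough heuristic, and `nm` from binutils is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: guess which libstdc++ string ABI a mangled symbol refers to.
# "cxx11" in the name (the B5cxx11 tag or the __cxx11 namespace) points at the new ABI.
# "Ss" is the Itanium substitution for the old std::string; matching it as a plain
# substring is only a heuristic and can misfire on unrelated identifiers.
abi_hint() {
    case "$1" in
        *cxx11*) echo "new" ;;
        *Ss*)    echo "old" ;;
        *)       echo "unknown" ;;
    esac
}

# Hypothetical helper: succeed if shared library $1 defines the dynamic symbol $2.
has_symbol() {
    nm -D --defined-only "$1" 2>/dev/null | grep -qF -- "$2"
}

abi_hint _ZNK6duckdb28SimpleNamedParameterFunction8ToStringB5cxx11Ev
abi_hint _ZN6duckdb9Exception25ConstructMessageRecursiveERKSsRSt6vectorINS_20ExceptionFormatValueESaIS4_EE
```

Running something like `has_symbol /usr/local/lib/libduckdb.so _ZNK6duckdb28SimpleNamedParameterFunction8ToStringB5cxx11Ev && echo present` against each copy of libduckdb.so would show which of the two variants a given build exports.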
@theory FYI: With #618 we've added support to build DuckDB statically into pg_duckdb. That's obviously very different from working with upstream DuckDB, but it should hopefully still help with your packaging conflicts.
Yes, that should solve the conflict, though it should be fun to see an instance that has pg_duckdb (static), duckdb_fdw (DSO), and pg_analytics (Rust) installed at the same time. Lotta duplicate code.
pg_duckdb should work fine with a vanilla duckdb shared library nowadays, because we deleted our custom caching code in #644. If that's not the case, please share some more details after trying it out.
It still requires a specific version of duckdb though (on current master that is 1.2.1), so you might still run into issues if other extensions that use duckdb depend on a different version. So you'll likely want to do something similar to what Neon did in this PR: https://github.com/neondatabase/neon/pull/10915
As explained above, you can now also compile duckdb statically into pg_duckdb to avoid file conflicts using:
make -j20 DUCKDB_BUILD=ReleaseStatic
make install DUCKDB_BUILD=ReleaseStatic
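To confirm that the static build took effect, one option is to check that the installed pg_duckdb.so no longer lists a shared libduckdb among its dependencies. This is a sketch under assumptions: `uses_shared_duckdb` is a hypothetical helper, the extension is assumed to be installed under `pg_config --pkglibdir`, and `ldd` is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and succeeds if a shared
# libduckdb appears among the listed dependencies.
uses_shared_duckdb() {
    grep -q 'libduckdb'
}

# Only run the check when pg_config and the installed extension are present.
if command -v pg_config >/dev/null 2>&1; then
    so="$(pg_config --pkglibdir)/pg_duckdb.so"
    if [ -f "$so" ]; then
        if ldd "$so" | uses_shared_duckdb; then
            echo "pg_duckdb.so still depends on a shared libduckdb"
        else
            echo "pg_duckdb.so has no shared libduckdb dependency"
        fi
    fi
fi
```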