Make pg_duckdb work with vanilla libduckdb.so

Open theory opened this issue 11 months ago • 7 comments

What happens?

I built and installed pg_duckdb, then downloaded the latest binary release, libduckdb-linux-amd64.zip, and put libduckdb.so into /usr/local/lib so that the OS would find it in the ld paths when Postgres requested it. It failed with this error:

FATAL:  could not load library "/var/lib/postgresql/data/15/lib/pg_duckdb.so": /var/lib/postgresql/data/15/lib/pg_duckdb.so: undefined symbol: _ZNK6duckdb28SimpleNamedParameterFunction8ToStringB5cxx11Ev

So I replaced /usr/local/lib/libduckdb.so with the file from pg_config --pkglibdir (the OS cannot find it there, since it's not in the ld paths, so I don't understand why make install puts it there). That fixed the issue. Yay!

Except that now the other DuckDB extension I'm working with, alitrack/duckdb_fdw, doesn't work with the libduckdb.so compiled by pg_duckdb (and I do have a symlink from libduckdb.1.1.3.so so that duckdb_fdw is happy). It complains:

ERROR:  could not load library "/var/lib/postgresql/data/15/lib/duckdb_fdw.so": /var/lib/postgresql/data/15/lib/duckdb_fdw.so: undefined symbol: _ZN6duckdb9Exception25ConstructMessageRecursiveERKSsRSt6vectorINS_20ExceptionFormatValueESaIS4_EE

Maddening.

Would it be possible to get pg_duckdb to work with the binary release of libduckdb.so? If not, is there some build configuration to get everything included? It seems like both symbols ought to be present, honestly, since ConstructMessageRecursive has been around for two years and SimpleNamedParameterFunction for four.

To Reproduce

git clone --depth 1 --branch "v0.2.0" https://github.com/duckdb/pg_duckdb
make -C pg_duckdb install
curl -LO https://github.com/duckdb/duckdb/releases/download/v1.1.3/libduckdb-linux-amd64.zip
unzip -d . libduckdb-linux-amd64.zip libduckdb.so
install -m 755 libduckdb.so /usr/local/lib/libduckdb.so
(cd /usr/local/lib && ln -s libduckdb.so libduckdb.1.1.3.so)
# Add pg_duckdb to shared_preload_libraries and start Postgres

OS:

Linux

pg_duckdb Version (if built from source use commit hash):

0.2.0

Postgres Version (if built from source use commit hash):

15.10

Hardware:

No response

Full Name:

David Wheeler

Affiliation:

Tembo

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Not applicable - the reproduction does not require a data set

Did you include all code required to reproduce the issue?

  • [x] Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?

  • [x] Yes, I have

theory avatar Jan 23 '25 23:01 theory

Hi @theory, thanks for reporting the issue. Out of curiosity, why are you trying to bring the DuckDB library from https://github.com/duckdb/duckdb/releases/ ?

make install in the pg_duckdb root directory should build and install it. That doesn't work for you?

You can also take a look at our Dockerfile if it helps.

Y-- avatar Jan 24 '25 08:01 Y--

> Out of curiosity, why are you trying to bring the DuckDB library from https://github.com/duckdb/duckdb/releases/ ?

Because pg_duckdb is not the only extension that needs to use libduckdb.so. One example is alitrack/duckdb_fdw, which works great with the binary-distributed libduckdb.so, but fails with the pg_duckdb-compiled libduckdb.so with this error:

ERROR:  could not load library "/var/lib/postgresql/data/15/lib/duckdb_fdw.so": /var/lib/postgresql/data/15/lib/duckdb_fdw.so: undefined symbol: _ZN6duckdb9Exception25ConstructMessageRecursiveERKSsRSt6vectorINS_20ExceptionFormatValueESaIS4_EE

theory avatar Jan 24 '25 13:01 theory

Hmm. Currently we compile a slightly custom build of duckdb, due to the http request caching we support. It would definitely be nice to support running on an unmodified version, but that probably requires some changes to move the http caching to pg_duckdb (or at least to detect when we run with an unmodified version so that we can disable the caching logic).

It seems kinda weird that our libduckdb.so does not have that _ZN6duckdb9Exception25ConstructMessageRecursiveERKSsRSt6vectorINS_20ExceptionFormatValueESaIS4_EE symbol though. Maybe we're missing a compilation flag or something.

JelteF avatar Jan 24 '25 13:01 JelteF

That's my guess as well, though I didn't see anything obvious at a glance. Equally weird that _ZNK6duckdb28SimpleNamedParameterFunction8ToStringB5cxx11Ev is missing from the binary distribution. It's super odd TBH.
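
One thing that might be worth checking: the two missing symbols appear to encode different libstdc++ std::string ABIs. In the Itanium C++ name mangling that GCC uses, B5cxx11 is the [abi:cxx11] tag, while Ss is the abbreviation for std::string as it is mangled under the older, pre-cxx11 libstdc++ ABI. That would suggest (an assumption, not something verified here) that the binary release and the pg_duckdb-compiled library were built with different _GLIBCXX_USE_CXX11_ABI settings. A rough sketch of a classifier follows; the function name is made up for illustration and the Ss match is only a heuristic:

```shell
# Hypothetical helper: guess which libstdc++ std::string ABI a mangled symbol encodes.
# B5cxx11 is the [abi:cxx11] tag; St7__cxx11 is the std::__cxx11 inline namespace.
# A bare "Ss" is the old-ABI std::string abbreviation (heuristic match only).
abi_of_symbol() {
  case "$1" in
    *B5cxx11*|*St7__cxx11*) echo "cxx11" ;;
    *Ss*) echo "pre-cxx11" ;;
    *) echo "unknown" ;;
  esac
}

# The symbol pg_duckdb.so could not find in the binary release:
abi_of_symbol '_ZNK6duckdb28SimpleNamedParameterFunction8ToStringB5cxx11Ev'
# The symbol duckdb_fdw.so could not find in the pg_duckdb-built library:
abi_of_symbol '_ZN6duckdb9Exception25ConstructMessageRecursiveERKSsRSt6vectorINS_20ExceptionFormatValueESaIS4_EE'
```

To see which variant a given libduckdb.so actually exports, something like nm -D --defined-only libduckdb.so | grep ConstructMessageRecursive should list the mangled name it provides.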

theory avatar Jan 24 '25 14:01 theory

@theory FYI: With #618 we've added support for building DuckDB statically into pg_duckdb. That's obviously very different from working with upstream DuckDB, but it should hopefully still help with your packaging conflicts.

JelteF avatar Feb 21 '25 14:02 JelteF

Yes, that should solve the conflict, though it should be fun to see an instance that has pg_duckdb (static), duckdb_fdw (DSO), and pg_analytics (Rust) installed at the same time. Lotta duplicate code.

theory avatar Feb 21 '25 18:02 theory

pg_duckdb should work fine with a vanilla duckdb shared library nowadays, because we deleted our custom caching code in #644. If that's not the case, please share some more details after trying it out.

It still requires a specific version of duckdb though (on current master that is 1.2.1), so you might still run into the issue that other extensions that use duckdb depend on a different version. So likely you'll want to do something similar to what Neon did in this PR: https://github.com/neondatabase/neon/pull/10915

As explained above, you can now also compile duckdb statically into pg_duckdb to avoid file conflicts using:

make -j20 DUCKDB_BUILD=ReleaseStatic
make install DUCKDB_BUILD=ReleaseStatic
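
To confirm that the static build took effect, one option is to check that the installed pg_duckdb.so no longer lists libduckdb among its dynamic dependencies. This is only a sketch: it assumes ldd is available and that the extension was installed into the directory reported by pg_config --pkglibdir, and the helper name is made up for illustration.

```shell
# Hypothetical helper: report whether a shared object dynamically links libduckdb.
check_pg_duckdb_linkage() {
  if [ ! -f "$1" ]; then
    echo "not found: $1"
    return 2
  fi
  # ldd lists the shared libraries the loader would resolve for this file.
  if ldd "$1" 2>/dev/null | grep -q 'libduckdb'; then
    echo "dynamic: $1 depends on a shared libduckdb"
  else
    echo "no dynamic libduckdb dependency in $1"
  fi
}

# Assumes the extension lives in the Postgres package library directory.
check_pg_duckdb_linkage "$(pg_config --pkglibdir 2>/dev/null)/pg_duckdb.so" || true
```

If the second message is printed, the extension should not care which libduckdb.so other extensions such as duckdb_fdw pick up from the ld paths.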

JelteF avatar Mar 21 '25 14:03 JelteF