postgresql_cluster icon indicating copy to clipboard operation
postgresql_cluster copied to clipboard

Add ParadeDB pg_search/pg_analytics

Open philippemnoel opened this issue 1 year ago • 6 comments

Hi there!

This project is super cool, thank you for making it. The HA situation in Postgres is frankly still quite complicated. I'm one of the makers of https://github.com/paradedb/paradedb.

We'd love to have pg_search (and perhaps pg_analytics) ship as part of postgresql_cluster, so that all users can benefit from it. We prebuild binaries, so it should be really straightforward. You can find them in our GitHub Releases.

Phil

philippemnoel avatar Aug 22 '24 21:08 philippemnoel

Hi @philippemnoel

Thank you for the feedback! Your project looks very promising, and I'll definitely take a closer look at it soon. I'm currently working on implementing full-text search for a large database and have run into the size limit of tsvector. Adding your extensions to the project shouldn't be difficult, and we'll make sure to do that.

vitabaks avatar Aug 22 '24 22:08 vitabaks

Done.

To deploy a PostgreSQL High-Availability Cluster with the ParadeDB extensions, add the enable_paradedb variable:

ansible-playbook deploy_pgcluster.yml -e "enable_paradedb=true"

vitabaks avatar Sep 17 '24 13:09 vitabaks

@philippemnoel this is great and I love ParadeDB. Unfortunately it doesn't work with the replicas as I get an error when reading from them. I'm assuming this is because of your Enterprise strategy and you only offer replication on Enterprise? If that's the case then it's misleading to enable it in the cluster as only the primary works.

herkulano avatar May 11 '25 18:05 herkulano

@herkulano What error are you getting?

vitabaks avatar May 12 '25 04:05 vitabaks

@vitabaks I get this error when I try to read from the replica:

Query 1 ERROR at Line 1: : ERROR:  could not read blocks 6..6 in file "base/5/17302": read only 0 of 8192 bytes

This error points to the pg_search index:

SELECT oid, relname, relkind FROM pg_class WHERE relfilenode = 17302;

There's a product_search_idx BM25 index in the replica, but it can't be queried. The query below gives the same error:

SELECT name, field_type FROM paradedb.schema('product_search_idx');

I'm assuming that @philippemnoel is blocking the ability to have replicas in the open source version of ParadeDB, given this on their website: Enterprise strategy. I just feel this is misleading because it doesn't fully work on a cluster, where the replicas can be promoted to primary and then the search is completely broken because the replica doesn't have the pg_search index data. It's also broken if you want to use the replicas for read operations, which I was trying to do.

herkulano avatar May 12 '25 08:05 herkulano

@herkulano Thank you for updating this topic, this is a really important change... In this form, the extension cannot be used in production.

vitabaks avatar May 12 '25 08:05 vitabaks

If you want to use pg_search in production with replication you need to pay for the enterprise version.

I'm closing it because it's not up to us.

vitabaks avatar Jun 30 '25 19:06 vitabaks