ClickHouse
ClickHouse copied to clipboard
Postgres certificate error
Hi, I upgraded Clickhouse from 24.6 to 24.9 (using the operator)
A query using the postgresql function ( https://clickhouse.com/docs/en/sql-reference/table-functions/postgresql ) started failing after this upgrade:
HTTPDriver for http://clickhouse-clickhouse.my-clickhouse.svc.cluster.local:8123 returned response code 500)
Code: 614. DB::Exception: Try 2. Connection to `postgres.default.svc.cluster.local:5432` failed with error: connection to server at "postgres.default.svc.cluster.local" (10.100.187.138), port 5432 failed: could not open certificate file "/root/.postgresql/postgresql.crt": Permission denied
connection to server at "postgres.default.svc.cluster.local" (10.100.187.138), port 5432 failed: FATAL: no pg_hba.conf entry for host "172.16.210.196", user "elevate", database "elevate_develop", no encryption
I fixed this by setting the PGSSLCERT env var on the podTemplate.
ref: https://github.com/nginx/unit/issues/834#issuecomment-1410297997
Maybe some libpq upgrade in Clickhouse triggered this? Users should not need to set this env var I don't think.
👍🏻
As I am running in Kubernetes and wanted to avoid a restart I just created the directory /root/.postgresql and made it readable. Still just a workaround, was not needed before upgrading to 24.9.1.
We have a few remote tables using the PostgreSQL engine and encountered the same error after upgrading ClickHouse from 24.8.3.59-alpine to 24.9.2.42-alpine
@mausch W/A did help, thanks!
@rschu1ze it seems this might have been introduced in the recent libraries update (https://github.com/ClickHouse/ClickHouse/pull/69580 or https://github.com/ClickHouse/ClickHouse/pull/69741)
I indeed recently bumped Postgres-related libraries needed by the Postgres table function as the previous code dump was not maintainable
libpq
- https://github.com/ClickHouse/ClickHouse/pull/69564
- https://github.com/ClickHouse/ClickHouse/pull/69674
- https://github.com/ClickHouse/ClickHouse/pull/69741
libpqxx
- https://github.com/ClickHouse/ClickHouse/pull/69580
- (not merged) https://github.com/ClickHouse/ClickHouse/pull/69572
Above PRs were done quite mechanically: update the submodule in small version increments, re-apply (or drop) custom patches, check if tests fail. I am no PostgreSQL expert, I actually did not even setup the Postgres table engine by myself during this exercise. Obviously, none of the existing tests caught the issue (though, it was not proven yet that actually my PRs caused the problem.)
What would be helpful is a writeup of the reproduction steps, starting from a scratch ClickHouse and a scratch Postgres (again, I am PostgreSQL noob). Also, in case someone likes to bisect this: most PRs include smaller commits. Start at the bottom, revert them one-by-one and check if the problem persists.
EDIT: While writing this, I see https://github.com/nginx/unit/issues/834 which was referenced above. It says libpq 15.1 is to blame (which we get by https://github.com/ClickHouse/ClickHouse/pull/69674) and then the issue was closed. So does that mean that the behavior is expected? Is there anything to fix at all?
Is there anything to fix at all?
I think Clickhouse should either:
- make sure the process in the docker container has access to
/rootand/or/root/.postgresql - run the process in the docker container as a different user
@mausch Is it right to assume that you are using the official ClickHouse Docker image from dockerhub?
I think this still needs a writeup of the exact steps to reproduce this error ...
Is it right to assume that you are using the official ClickHouse Docker image from dockerhub?
Yep I'm using the default images managed by the Clickhouse operator.
I think this still needs a writeup of the exact steps to reproduce this error ...
Basically: run Clickhouse docker 24.9.2.42-alpine and attempt to connect to Postgres from it.
Alternatively you can install ClickHouse on a fresh Ubuntu machine link, and connect to a fresh Postgres installation @rschu1ze . In my test I used Postgres 15.
Or if you have an existing machine, the error is introduced when updating from 24.6.* to 24.9.2.42 (we did this 3 days ago in our test environment, and the error message seems to be the same).
Let me know if you want me to try something out, or to test something.
I am running 24.9.2.42 and can confirm the issue exists on this version. I have also attempted with 24.10.1.2812. I tried recreating the database and the error still exist.
temp fix, add the following to env variable:
env:
- name: PGSSLCERT
value: /tmp/postgresql.crt
I have similar and TBH i tried all options suggested in this thread. I have Postgres in RDS (AWS) and clickhouse 24.9.2.42 in docker container . I keep on getting this error:
port 5432 failed: could not stat private key file "/root/.postgresql/postgresql.key": Permission denied
and the query tried from clickhouse is this
insert into table select id, item_id, company_id, 1 as sign,toUnixTimestamp(modified_at) AS version from postgresql("host","database","table", "user","pass","schema", 'sslmode=verify-full;sslrootcert=/etc/clickhouse-client/ssl/postresql.crt')
NOTE: Tried without ssl mode in option as well
Does anyone know how we can fix this?
work around provided by yannick also did not work.
I looked into this issue and found out that (besides all the problems with defaults) we don't provide optional connection parameters neither with mysql() nor postgresql() functions, thought libraries we are using (libpqxx for postgres and mariadb lib for mysql) have these options in their APIs...
HI. This still is a problem in 25.1 both alpine and ubuntu.
What's the actual fix? Why does this happen?
Is there any fix for this? This is a real issue. It works on v24.3.18, but not on v25.1.5 - This is verified.
INSERT INTO db.table SELECT *
FROM postgresql('host', 'database', 'table', 'postgres', 'password');
Error:
Received exception from server (version 25.1.5):
Code: 614. DB::Exception: Received from clickhouse-db-0.clickhouse-db.clickhouse.svc.cluster.local:9000. DB::Exception: Try 2. Connection to `host:5432` failed with error: connection to server at "host" (10.2.4.238), port 5432 failed: could not open certificate file "/root/.postgresql/postgresql.crt": Permission denied
connection to server at "host" (10.2.4.238), port 5432 failed: FATAL: no pg_hba.conf entry for host "10.2.3.96", user "postgres", database "demo_prod", no encryption
. (POSTGRESQL_CONNECTION_FAILURE)
Sorry, at the risk of repeating myself. This issue needs a minimally reproducible reproducer with a step-by-step guide how to run into the problem (Kubernetes and setting environment variables in shady places as workaround doesn't count).
step-by-step guide how to run into the problem (Kubernetes and setting environment variables in shady places as workaround doesn't count).
I've already mentioned how to repro in https://github.com/ClickHouse/ClickHouse/issues/70365#issuecomment-2413122055 without Kubernetes or "setting environment variables in shady places" 🙂
To be super clear: setting environment variables is a workaround for the issue, not part of the repro.
Running ClickHouse in kubernetes. Upgraded from 24.8-LTS to 25.3-LTS and faced this issue. The env var workaround solved it.
Sorry, at the risk of repeating myself. This issue needs a minimally reproducible reproducer with a step-by-step guide how to run into the problem (Kubernetes and setting environment variables in shady places as workaround doesn't count).
What format would you like? I believe a few of us have attempted to share how we reproduce the issue. Let me know if I can be more concrete or help you in some way @rschu1ze.
Sorry, I don't find the time to work on this in soon future.
As soon as a workaround exists, and it is an upstream library issue, let's close.
I'm surprised by the response from the co-founder/CTO.
This issue stems from ClickHouse upgrading libpq without adequately testing for breaking changes. Whether the breaking change is documented or a bug in libpq is ultimately irrelevant. ClickHouse users interface with ClickHouse, not its dependencies. They expect it to work reliably.
While a workaround exists, the problem prevents users from running ClickHouse in Docker or Kubernetes environments and connecting to Postgres, arguably one of its most valuable integrations, unless they search and find this issue and bother to apply the workaround.
One of the main selling points of Clickhouse for us and likely many other companies is its ease of integration with Postgres and other data sources. Being able to run it locally is critical for prototyping and validating a proof of concept before eventually migrating to ClickHouse Cloud for production workloads. Leaving this issue unresolved risks losing potential customers, not just because of the technical problem, but also due to the perceived lack of support.
Thanks, adding "help wanted".
We were using an AWS Aurora Postgres DB with ClickHouse and came across this error trying to create a DB using the MaterializedPostgreSQL engine. Apparently this error can manifest in the case of insufficient permissions - we used the reader endpoint instead of the writer endpoint, and this engine needs to be able to create a replication slot. Using the writer endpoint solved the issue.
Just posting in case anyone else finds themselves here with a similar configuration.