Cannot acquire connection from data source while using HikariCP connection pooling
The clickhouse-jdbc dependency version is 0.1.46. Here is the stack trace:
```
Cannot acquire connection from data source
java.lang.IllegalStateException: cannot initialize http client
at ru.yandex.clickhouse.ClickHouseConnectionImpl.<init>(ClickHouseConnectionImpl.java:73)
at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:58)
at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:50)
at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:32)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:95)
at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:316)
at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:173)
at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:431)
at com.zaxxer.hikari.pool.HikariPool.access$500(HikariPool.java:66)
at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:574)
at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:567)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
```
I experience this issue every now and then: after a few hours the pool runs out of connections, which causes the application to crash. From the stack trace, you can see that the exception is thrown by ClickHouseConnectionImpl. It works fine when connecting directly to the ClickHouse DB using the ClickHouseDataSource.
This is how we used to connect to the ClickHouse DB (the data source was rebuilt every time a new connection was needed):
```java
ClickHouseDataSource cds = new ClickHouseDataSource(jdbcUrl,
        getClickHouseProperties(user, password, socketTimeout));
Connection con = cds.getConnection();
```
Now we use HikariCP pooling:
```java
HikariConfig config = new HikariConfig();
config.setJdbcUrl(jdbcUrl);
config.setDataSourceProperties(getClickHouseProperties(user, password, socketTimeout));
config.setPoolName("clickhouse-pool-1");
config.setDriverClassName("ru.yandex.clickhouse.ClickHouseDriver");
config.setMaximumPoolSize(connectionCount);
HikariDataSource dataSource = new HikariDataSource(config);
```
Here we reuse the data source every time.
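For context, the getClickHouseProperties helper is not shown in the snippets above; here is a minimal sketch of what it might look like, assuming the ru.yandex.clickhouse ClickHouseProperties API (the conversion via asProperties() and the exact setters are assumptions):

```java
import java.util.Properties;
import ru.yandex.clickhouse.settings.ClickHouseProperties;

// Hypothetical helper assumed by the snippets above; not part of the original post.
private static Properties getClickHouseProperties(String user, String password, int socketTimeout) {
    ClickHouseProperties props = new ClickHouseProperties();
    props.setUser(user);
    props.setPassword(password);
    props.setSocketTimeout(socketTimeout); // milliseconds
    return props.asProperties();
}
```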
ClickHouse internally uses PoolingHttpClientConnectionManager to set up the HttpClient that underlies the ClickHouse DataSource. Unless we create the DataSource object over and over again, it internally creates a single connection pool, which is used for all HTTP requests it builds for each JDBC connection over that DataSource.
Below is a code excerpt from ClickHouseHttpClientBuilder, which is called from ClickHouseConnectionImpl, which in turn is invoked when you create a ClickHouseDataSource:
```java
public CloseableHttpClient buildClient() throws Exception {
    return HttpClientBuilder.create()
            .setConnectionManager(getConnectionManager())
            .setKeepAliveStrategy(createKeepAliveStrategy())
            .setDefaultConnectionConfig(getConnectionConfig())
            .setDefaultRequestConfig(getRequestConfig())
            .setDefaultHeaders(getDefaultHeaders())
            .disableContentCompression() // gzip is pointless here; lz4 is used when compress=1
            .disableRedirectHandling()
            .build();
}
```
```java
private PoolingHttpClientConnectionManager getConnectionManager()
        throws CertificateException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException, IOException {
    RegistryBuilder<ConnectionSocketFactory> registry = RegistryBuilder.<ConnectionSocketFactory>create()
            .register("http", PlainConnectionSocketFactory.getSocketFactory());

    if (properties.getSsl()) {
        HostnameVerifier verifier = "strict".equals(properties.getSslMode())
                ? SSLConnectionSocketFactory.getDefaultHostnameVerifier()
                : NoopHostnameVerifier.INSTANCE;
        registry.register("https", new SSLConnectionSocketFactory(getSSLContext(), verifier));
    }

    //noinspection resource
    PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager(
            registry.build(),
            null,
            null,
            new IpVersionPriorityResolver(),
            properties.getTimeToLiveMillis(),
            TimeUnit.MILLISECONDS
    );
    connectionManager.setDefaultMaxPerRoute(properties.getDefaultMaxPerRoute());
    connectionManager.setMaxTotal(properties.getMaxTotal());
    connectionManager.setDefaultConnectionConfig(getConnectionConfig());
    return connectionManager;
}
```
Have you configured defaultMaxPerRoute (the default maximum number of connections per route) and maxTotal (the default maximum total number of connections)? The defaults can be found in ClickHouseConnectionSettings:
```java
DEFAULT_MAX_PER_ROUTE("defaultMaxPerRoute", 500, ""),
MAX_TOTAL("maxTotal", 10000, ""),
```
Perhaps we should disable HTTP client pooling by default, so that we don't have to configure two pools (Hikari and HttpClient in this case) for one simple thing.
Excuse me, what is the viable solution at this point? How does one configure this when using Spring Data JDBC?
Excuse me, what is the viable solution at this point? How does one configure this when using Spring Data JDBC?
You can't. At this point, ClickHouse is not supported by Hibernate. Your best bet is to use JdbcTemplate: initialize a ClickHouseDataSource and pass it to the JdbcTemplate constructor. This will work like a charm. Another long shot is to enable the MySQL interface on ClickHouse and then use the MySQL driver to access it. There are a few limitations to this if I remember correctly (some DDLs are not supported).
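For illustration, a minimal sketch of that approach, assuming Spring's JdbcTemplate and the ru.yandex.clickhouse driver; the URL, credentials, and query are placeholders:

```java
import org.springframework.jdbc.core.JdbcTemplate;
import ru.yandex.clickhouse.ClickHouseDataSource;
import ru.yandex.clickhouse.settings.ClickHouseProperties;

// Build the ClickHouse data source directly and hand it to JdbcTemplate.
ClickHouseProperties props = new ClickHouseProperties();
props.setUser("default"); // placeholder credentials
props.setPassword("");
ClickHouseDataSource clickHouseDataSource =
        new ClickHouseDataSource("jdbc:clickhouse://localhost:8123/default", props);

JdbcTemplate jdbcTemplate = new JdbcTemplate(clickHouseDataSource);
Long tableCount = jdbcTemplate.queryForObject("SELECT count() FROM system.tables", Long.class);
```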
Thank you! Your advice has helped immensely.
Hi, I've got a question about ClickHouseDataSource. You've said:
ClickHouse internally uses PoolingHttpClientConnectionManager for setting up the HttpClient which is the base of the ClickHouse Datasource.
and should be used without external connection pools like Hikari. I'm testing that with a simple app that opens two connections from the same DataSource and pings the DB on each of them. It looks like the HttpClient and its pool are created per ClickHouseConnectionImpl, not per ClickHouseDataSource.
Code like this (Scala):
```scala
def ping(): Boolean = {
  val res =
    Using(ds.getConnection) { connection1 =>
      Using(ds.getConnection) { connection2 =>
        val res1 = Try {
          connection1.isValid(0)
        }.getOrElse(false)
        val res2 = Try {
          connection2.isValid(0)
        }.getOrElse(false)
        res1 || res2
      }.getOrElse(false)
    }.getOrElse(false)
  reportResult(res)
  res
}
```
Leaves this trace for org.apache.http.impl.conn.PoolingHttpClientConnectionManager:
```
2022-02-09 21:53:33.117 Connection request: [route: {s}->myHost:8443][total available: 0; route allocated: 0 of 5; total allocated: 0 of 10000]
2022-02-09 21:53:33.117 Connection leased: [id: 775][route: {s}->myHost:8443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.138 Connection released: [id: 775][route: {s}->myHost:8443][total available: 1; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.138 Connection [id: 775][route: {s}->myHost:8443] can be kept alive for 3.0 seconds
2022-02-09 21:53:33.139 Connection request: [route: {s}->myHost:8443][total available: 0; route allocated: 0 of 5; total allocated: 0 of 10000]
2022-02-09 21:53:33.140 Connection leased: [id: 778][route: {s}->myHost:8443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.159 Connection released: [id: 778][route: {s}->myHost:8443][total available: 1; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.159 Connection [id: 778][route: {s}->myHost:8443] can be kept alive for 3.0 seconds
2022-02-09 21:53:33.160 Connection request: [route: {s}->myHost:8443][total available: 1; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.160 Connection leased: [id: 775][route: {s}->myHost:8443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.165 Connection released: [id: 775][route: {s}->myHost:8443][total available: 1; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.165 Connection [id: 775][route: {s}->myHost:8443] can be kept alive for 3.0 seconds
2022-02-09 21:53:33.165 Connection request: [route: {s}->myHost:8443][total available: 1; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.165 Connection leased: [id: 778][route: {s}->myHost:8443][total available: 0; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.170 Connection released: [id: 778][route: {s}->myHost:8443][total available: 1; route allocated: 1 of 5; total allocated: 1 of 10000]
2022-02-09 21:53:33.170 Connection [id: 778][route: {s}->myHost:8443] can be kept alive for 3.0 seconds
2022-02-09 21:53:33.170 Connection manager is shutting down
2022-02-09 21:53:33.170 Connection manager is shutting down
2022-02-09 21:53:33.171 Connection manager shut down
2022-02-09 21:53:33.170 Connection manager shut down
```
Here I have two connection pools that do not share their HTTP connections with each other. Each serves two requests (the first being select timezone() on connection initialization, since I'm using default settings), which do go through the pools. But that means keeping the same ClickHouseConnection for the whole application, not the ClickHouseDataSource, and sharing a ClickHouseConnection does not look nice, because it has state (e.g. you could call setCatalog on it).
So it looks like you do need a Hikari pool to work with clickhouse-jdbc? I'm using version 0.2.6 of the driver; this issue probably references 0.2.5, but there should not be much of a difference.
It looks like the HttpClient and its pool are created per ClickHouseConnectionImpl, not per ClickHouseDataSource.
Correct. ClickHouseConnectionImpl holds a CloseableHttpClient, which uses PoolingHttpClientConnectionManager to manage connections.
But that means keeping the same ClickHouseConnection for the whole application, not the ClickHouseDataSource, and sharing a ClickHouseConnection does not look nice, because it has state (e.g. you could call setCatalog on it).
Yes. ClickHouseDataSource is only used to create ClickHouseConnection instances. It doesn't make sense to share the same ClickHouseConnection, although it works because of the HttpClient pool behind the scenes.
So it looks like you do need a Hikari pool to work with clickhouse-jdbc?
Yes, if you want to 1) validate connections before execution; and 2) limit the number of connections between client and server. In general you should start with a small pool size and increase it based on your needs and observations.
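As a rough starting point, a minimal sketch of such a conservative HikariCP setup; the specific values are illustrative assumptions, not recommendations from the driver:

```java
// Sketch: small pool, validated connections; grow only if monitoring shows contention.
HikariConfig config = new HikariConfig();
config.setJdbcUrl(jdbcUrl);
config.setDriverClassName("ru.yandex.clickhouse.ClickHouseDriver");
config.setMaximumPoolSize(5);              // start small, increase based on observation
config.setConnectionTestQuery("SELECT 1"); // validate a connection before handing it out
config.setValidationTimeout(3_000);        // milliseconds
HikariDataSource pool = new HikariDataSource(config);
```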
Please keep in mind that the implementation may change over time. Since 0.3.2, Apache HttpClient has been replaced by HttpURLConnection, which has its own way of managing the underlying socket. Moreover, please be aware that ClickHouse supports multiple protocols. Besides HTTP, clickhouse-jdbc now supports gRPC, and Native/TCP will be added in the next release.