ClickHouse-Native-JDBC

PySpark hanging using this connector

Open ruiyang2015 opened this issue 3 years ago • 2 comments

The official driver seems to work. I am trying to switch to the native JDBC driver and read through PySpark, but it hangs without showing any error message. The only log output I see is the following:

I1009 02:53:09.454974 46 jdbc.py:161] open connection {self._host} {self._database} {self._port}
log4j:WARN No appenders could be found for logger (com.github.housepower.jdbc.ClickhouseJdbcUrlParser).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Here is the PySpark snippet:

   pgDF = (
       spark.read.format('jdbc')
       .option('driver', 'com.github.housepower.jdbc.ClickHouseDriver')
       .option('url', 'jdbc:clickhouse://127.0.0.1:9000')
       .option('user', user)
       .option('password', password)
       .option('dbtable', dbtable)
       .load()
   )

I also tried connecting to the database with the Python JayDeBeApi package, and that hangs as well.

The same code works with the official driver, so I am wondering if I am missing anything here. The official driver is a bit too big in size, while this one seems quite small and does not require as many dependencies. If this works, I would really like to switch to this driver.
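One way to narrow down a silent hang like this is to first check whether the port in the JDBC URL accepts TCP connections at all, independent of Spark and the driver. The sketch below is only an illustration using the Python standard library: the helper name `port_is_reachable` and the 3-second timeout are assumptions, and the host and port are copied from the snippet above. A `True` result only shows that something is listening on the port; it does not confirm that the driver's handshake will succeed.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port taken from the JDBC URL in the snippet above.
    print(port_is_reachable("127.0.0.1", 9000))
```

If this prints `False`, the hang is more likely a network or port issue than a problem inside the driver itself.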

ruiyang2015 avatar Oct 09 '21 03:10 ruiyang2015

Sorry, I am not familiar with PySpark.

the official is a bit too big in size

The official driver also provides shaded jars

https://repo1.maven.org/maven2/ru/yandex/clickhouse/clickhouse-jdbc/0.3.1-patch/clickhouse-jdbc-0.3.1-patch-shaded.jar

<dependency>
    <groupId>ru.yandex.clickhouse</groupId>
    <artifactId>clickhouse-jdbc</artifactId>
    <version>0.3.1-patch</version>
    <classifier>shaded</classifier>
</dependency>

pan3793 avatar Oct 09 '21 03:10 pan3793

Thanks, I will give that a try.

ruiyang2015 avatar Oct 09 '21 04:10 ruiyang2015