ClickHouse-Native-JDBC
PySpark hanging using this connector
The official driver seems to work. I am trying to switch to the native JDBC driver and read from it with PySpark, but the read hangs without showing any error message. The only log output I see is the following:
I1009 02:53:09.454974 46 jdbc.py:161] open connection {self._host} {self._database} {self._port}
log4j:WARN No appenders could be found for logger (com.github.housepower.jdbc.ClickhouseJdbcUrlParser).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Here is the PySpark snippet:
pgDF = (
    spark.read.format('jdbc')
    .option('driver', 'com.github.housepower.jdbc.ClickHouseDriver')
    .option('url', 'jdbc:clickhouse://127.0.0.1:9000')
    .option('user', user)
    .option('password', password)
    .option('dbtable', dbtable)
    .load()
)
I also tried connecting to the database with the Python JayDeBeApi package, and that hangs as well.
The same code works with the official driver, so I wonder whether I am missing something here. The official driver is a bit too big in size, while this one seems quite small and does not require as many dependencies. If it works, I would really like to switch to this driver.
Sorry, I am not familiar with PySpark.
"the official is a bit too big in size"
The official driver also provides shaded jars:
https://repo1.maven.org/maven2/ru/yandex/clickhouse/clickhouse-jdbc/0.3.1-patch/clickhouse-jdbc-0.3.1-patch-shaded.jar
<dependency>
    <groupId>ru.yandex.clickhouse</groupId>
    <artifactId>clickhouse-jdbc</artifactId>
    <version>0.3.1-patch</version>
    <classifier>shaded</classifier>
</dependency>
Thanks, I will give that a try.
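For anyone trying the shaded-jar suggestion from PySpark, a minimal sketch might look like the following. It is not taken from the thread. The jar path and the helper names `build_jdbc_options` and `read_table` are hypothetical. It also assumes the official driver class `ru.yandex.clickhouse.ClickHouseDriver` and the server's HTTP port (8123 by default), because the official driver uses the HTTP interface rather than the native port 9000 used in the snippet above.

```python
from typing import Dict

# Assumptions (not from the thread): the local jar path below is hypothetical,
# and the official driver is expected to talk to the HTTP port (8123 by default).
SHADED_JAR = "/path/to/clickhouse-jdbc-0.3.1-patch-shaded.jar"
OFFICIAL_DRIVER = "ru.yandex.clickhouse.ClickHouseDriver"


def build_jdbc_options(host: str, port: int, database: str,
                       user: str, password: str, dbtable: str) -> Dict[str, str]:
    """Collect the options handed to spark.read.format('jdbc')."""
    return {
        "driver": OFFICIAL_DRIVER,
        "url": f"jdbc:clickhouse://{host}:{port}/{database}",
        "user": user,
        "password": password,
        "dbtable": dbtable,
    }


def read_table(options: Dict[str, str]):
    """Read one table through Spark's JDBC source (requires pyspark)."""
    # Imported lazily so build_jdbc_options can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .config("spark.jars", SHADED_JAR)  # put the shaded driver jar on the classpath
        .getOrCreate()
    )
    return spark.read.format("jdbc").options(**options).load()
```

Usage would be along the lines of `read_table(build_jdbc_options("127.0.0.1", 8123, "default", user, password, dbtable))`, with the port adjusted to whatever the server actually exposes.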