Adding a JDBC Postgres driver connection for an external DB to the environment / Jupyter
What Docker image are you using?
pyspark-notebook on an Ubuntu 18.04 server.
What complete docker command do you run to launch the container?
I am using a docker-compose.yaml with three services: jupyter, spark-master, and spark-worker-1. For the Spark services, I am running (approximately):
spark-master:
  image: "pyspark-notebook"
  command: /home/jovyan/start-spark.sh
  volumes:
    - /local/scratch-drive/:/scratch
    - /local/work/:/usr/local/spark/work
spark-worker-1:
  image: "pyspark-notebook"
  command: /home/jovyan/start-spark-worker.sh
  volumes:
    - /local/scratch-drive/:/scratch
    - /local/work/:/usr/local/spark/work
What steps do you take once the container is running to reproduce the issue?
I am not super familiar with Java, but I have tried a variety of things. Predominantly, I have been trying to run the jaydebeapi Python library inside Jupyter and point it at the three driver .jar files (one primary and two dependencies); call them driver.jar, dependency1.jar, and dependency2.jar. I have run the following (the jclassname and URL are not real) to no avail:
import os
import jaydebeapi

path = "/path/with/three_driver_jars/"
# jaydebeapi needs full paths to the jars; os.listdir() returns bare filenames
jars = [os.path.join(path, f) for f in os.listdir(path)]
conn = jaydebeapi.connect(jclassname='com.example.jdbc.Driver',
                          url='jdbc:https://0.0.0.0.0/sql:.',
                          driver_args=[user, pw],  # user and pw hold the real credentials
                          jars=jars)
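As a quick sanity check that the jars actually end up on the JVM classpath, you can start the JVM yourself through JPype (which jaydebeapi uses under the hood) before calling connect. This is only a sketch, assuming a recent JPype (>= 1.0) and the same placeholder class name as above:

import os
import jpype

path = "/path/with/three_driver_jars/"
jars = [os.path.join(path, f) for f in os.listdir(path)]
# Start the JVM with the driver jars on its classpath; this must happen
# before jaydebeapi.connect, since the JVM can only be started once
if not jpype.isJVMStarted():
    jpype.startJVM(classpath=jars)
# If the jars are visible, this resolves; otherwise it raises
jpype.JClass('com.example.jdbc.Driver')

If JClass raises here, the problem is the classpath itself rather than anything Spark-specific.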
What do you expect to happen?
I expect a connection to be created. I have been able to compile simple .java scripts on a local Windows machine that read out records, but I am unable to reproduce this JDBC configuration inside this Docker setup. I have tried moving the .jars into the JRE directory, editing spark-defaults.conf (see example below), and adding CLASSPATH environment variables:
(inside spark-defaults.conf)
spark.driver.extraClassPath /shared_jars/driver.jar:/shared_jars/dependency1.jar:/shared_jars/dependency2.jar
spark.executor.extraClassPath /shared_jars/driver.jar:/shared_jars/dependency1.jar:/shared_jars/dependency2.jar
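Note that extraClassPath is only read when the JVM starts, so the containers need to be restarted after editing spark-defaults.conf. If the goal is to reach the database through Spark itself, one alternative is to pass the jars when building the session. This is a sketch, assuming the jars are mounted at /shared_jars and using placeholder URL, table, and credentials:

from pyspark.sql import SparkSession

jars = ",".join([
    "/shared_jars/driver.jar",
    "/shared_jars/dependency1.jar",
    "/shared_jars/dependency2.jar",
])

spark = (SparkSession.builder
         .appName("jdbc-test")
         .config("spark.jars", jars)  # distributes the jars to driver and executors
         .getOrCreate())

df = (spark.read.format("jdbc")
      .option("url", "jdbc:postgresql://db-host:5432/mydb")  # placeholder URL
      .option("dbtable", "some_table")                       # placeholder table
      .option("user", "user")                                # placeholder credentials
      .option("password", "pw")
      .option("driver", "com.example.jdbc.Driver")  # for Postgres this would be org.postgresql.Driver
      .load())

The spark.jars config must be set before the session is created; it has no effect on an already-running session.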
What actually happens?
I am routinely confronted with the following error:
java.lang.RuntimeExceptionPyRaisable: java.lang.RuntimeException: Class com.example.jdbc.Driver not found
For more clarity, I have successfully configured the following Windows script to output data after compiling it with the three .jars on the classpath and executing it, so the driver and credentials seem to be OK.
public class ConnectTest {
    public static void main(String[] args) throws Exception {
        java.sql.Driver driver = new com.example.jdbc.Driver();
        java.util.Properties info = new java.util.Properties();
        info.put("user", username);      // placeholder credentials
        info.put("password", password);
        java.sql.Connection conn = driver.connect("https://0.0.0.0.0/sql", info);
        // et cetera....
        conn.close();
    }
}
Am I missing something here? Is the Java environment used for Spark incompatible or in need of modification? Or is this something I can hack together inside the current container state? I am very new to Java, but I have some decent experience with Python and Docker.
We're starting to experiment with a general Q&A section on https://discourse.jupyter.org/c/questions to see if cross-technology questions like this one catch more attention from a broader community audience. You might try re-posting your question over there to see if someone with more experience in this topic can help.
If you do post the question again on the Discourse site, feel free to leave a link in a comment here for those that happen upon this closed issue.
I don't know if your problem is specifically that you don't have the JDBC driver installed, but I had this problem with the MS SQL Server JDBC driver. I tried a couple of things but nothing seemed to work (adding the jar in the %%init_spark magic (I use Scala), and some stuff like that).
Finally, what I had to do was manually copy the JDBC driver's jar file into the CLASSPATH folder, which defaults to something like /usr/local/spark/jars.
I did this by using the docker cp command to get the file from the host into that specific path in the container. After doing that I had no more problems with that driver.
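If the jars are already mounted into the container (as /shared_jars is in the spark-defaults.conf example above), a quick in-notebook equivalent of that copy might look like this; a sketch, assuming that mount point and the default Spark install path:

import glob
import shutil

# Copy every mounted jar into the folder Spark scans by default
for jar in glob.glob("/shared_jars/*.jar"):
    shutil.copy(jar, "/usr/local/spark/jars/")

The JVM only scans that folder at startup, so restart the kernel (or the container) afterwards for the drivers to be picked up.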
@RobertSellers could you please tell us whether you were able to resolve this issue? Did @raderas's solution help you? Or are you no longer interested in this issue?
Closing this one, since no response was received.