jaydebeapi icon indicating copy to clipboard operation
jaydebeapi copied to clipboard

Explicitly specifying a jdbc driver using the jars parameter is not working in a Jupyter notebook context

Open chris-ratcliffe opened this issue 7 years ago • 21 comments

(originally I asked this in issue #22, but then realized it was closed, so I am opening a new issue)

@baztian , I am also seeing a failure running inside a Jupyter notebook vs a standalone python script where it works fine. I can't seem to figure out the difference. Here is the code run in the notebook:

# create JDBC connection to the netezza database
import jaydebeapi

!echo 'JAVA_HOME: '$JAVA_HOME
print("Driver name: " + driverName + "\nDriver path: " + driverFileName)
!ls -altr $driverFileName
!ls -daltr $driverDir
print("jaydebeapi version: " + jaydebeapi.__version__)

mpwconn = jaydebeapi.connect(jclassname=driverName, url=connectionString, driver_args={'user': uid, 'password': pwd},jars=driverFileName)

JAVA_HOME: /opt/ibm/java-x86_64-80
Driver name: org.netezza.Driver
Driver path: /home/dsxuser/drivers/nzjdbc-3.jar
-rwxr-xr-x 1 dsxuser dsxuser 359254 Oct 16 16:55 /home/dsxuser/drivers/nzjdbc-3.jar
drwxr-xr-x 2 dsxuser dsxuser 4096 Oct 16 15:09 /home/dsxuser/drivers
jaydebeapi version: 1.1.1

---------------------------------------------------------------------------
java.lang.RuntimeExceptionPyRaisable      Traceback (most recent call last)
<ipython-input-37-03b494b3284a> in <module>()
     11 #!md5sum $driverFileName
     12 
---> 13 mpwconn = jaydebeapi.connect(jclassname=driverName, url=connectionString, driver_args={'user': uid, 'password': pwd},jars=driverFileName)
     14 
     15 curs = mpwconn.cursor()

/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/jaydebeapi/__init__.py in connect(jclassname, url, driver_args, jars, libs)
    379     else:
    380         libs = []
--> 381     jconn = _jdbc_connect(jclassname, url, driver_args, jars, libs)
    382     return Connection(jconn, _converters)
    383 

/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/jaydebeapi/__init__.py in _jdbc_connect_jpype(jclassname, url, driver_args, jars, libs)
    188             return jpype.JArray(jpype.JByte, 1)(data)
    189     # register driver for DriverManager
--> 190     jpype.JClass(jclassname)
    191     if isinstance(driver_args, dict):
    192         Properties = jpype.java.util.Properties

/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/jpype/_jclass.py in JClass(name)
     53     jc = _jpype.findClass(name)
     54     if jc is None:
---> 55         raise _RUNTIMEEXCEPTION.PYEXC("Class %s not found" % name)
     56 
     57     return _getClassFor(jc)

java.lang.RuntimeExceptionPyRaisable: java.lang.RuntimeException: Class org.netezza.Driver not found


curs = mpwconn.cursor()

Is there any other information that I can provide to shed some light on why it can't find the class?

Originally posted by @chris-ratcliffe in https://github.com/baztian/jaydebeapi/issues/22#issuecomment-430323797

chris-ratcliffe avatar Oct 16 '18 19:10 chris-ratcliffe

I came across the same issue and tried several things, after running these lines of codes, connection seems to be working for me:

import jaydebeapi as jdbc
import jpype
import os

jar = Driver_location_path
os.environ['JAVA_HOME'] =Java_path
os.environ['CLASSPATH'] =jar

args='-Djava.class.path=%s' % jar
jvm_path = jpype.getDefaultJVMPath()
jpype.startJVM(jvm_path, args)

Hope it solve the problem!

qcmgrt avatar Nov 02 '18 19:11 qcmgrt

@chris-ratcliffe were you able to resolve this issue? I have been trying to connect Netezza in a python3 and mac but always getting class not found issue.

sushmit86 avatar Nov 28 '18 23:11 sushmit86

@sushmit86 , are you using an ipython notebook? If so, what I found was that after you install the driver in the filesystem, you have to restart your kernel before the notebook can see the driver file and load it properly. Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.

chris-ratcliffe avatar Nov 29 '18 14:11 chris-ratcliffe

@chris-ratcliffe I tried connecting using a python shell terminal as well. But I still get the same error. Also, I am not pulling from multiple databases

sushmit86 avatar Dec 07 '18 02:12 sushmit86

Sorry, @sushmit86 , the two tricks I mentioned above got my issue resolved, so I am not sure what else might be the problem.

chris-ratcliffe avatar Dec 07 '18 13:12 chris-ratcliffe

Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.

took me a while to process what this meant, but this is what solved the issue for me (second call to connect was ignoring jars parameter, all jars need to be in first call to connect)

himynameschris avatar Dec 19 '18 12:12 himynameschris

I also had this problem, and in my case I found that the way I was referencing the JAR file in my Windows was not working properly. So this below works fine for me.

During debugging I figured out that jaydebeapi imports jpype and starts the jvm on first run. All Jars must be included at jvm startup, otherwise you have to shutdown jvm and restart it with the jars you want. Anyway this is why the JAR has to defined in the FIRST jaydebeapi.connect function call, otherwise it cannot be used even if included in subsequent calls.

In case of Jupyter notebook, this means you have to restart the kernel to get the proper behavior IF you did not execute your code in the right sequence.

import jaydebeapi as jdbc

jdbc_driver_name = "org.netezza.Driver"
conn = jdbc.connect(jdbc_driver_name
                    ,"jdbc:netezza://server/database"
                    ,driver_args={'user': sql_login, 'password': sql_password}
                    ,jars = "C:\\JDBC\\nzjdbc.jar")

I tried this code below to debug, but in the end I discovered that I didn't have to run this section at all to get the code to work. This forces the jvm to start with your jar, and I guess it would work in odd cases where the very first jaydebeapi.connect function call includes all the jars you want and the jvm still doesn't start with the right jars.

# Depending on your environment and other anomalies, jvm may start without loading the required JAR. 
# In that case uncomment and run the following code.

import jpype
driver_jar_path = 'C:\\JDBC\\nzjdbc.jar'
jvmPath = jpype.getDefaultJVMPath()
args='-Djava.class.path=%s' % driver_jar_path
jpype.startJVM(jvmPath, args)

# If this section worked, then the below will print a nice message.
# if the below errors, then something is broken and no matter what you try the connection will NOT work.
print(jpype.JClass('org.netezza.Driver'))

gafortiby avatar Apr 30 '19 20:04 gafortiby

I also have a similar problem. So, I want to connect to my driver for apache calcite with the following code:

`import jaydebeapi as jay

conn = jay.connect("cqp.driver.CqpDriver", "jdbc:cqp:", {'user': "admin", 'password': "admin", 'model': "/cqp/src/main/resources/model-three.json" }, "/cqp/target/cqp-1.0-SNAPSHOT.jar")`

And I get the following trace:

Traceback (most recent call last): File "makis.py", line 149, in "/home/makis/Frameworks/makis-fdex/cqp/target/cqp-1.0-SNAPSHOT.jar") File "/usr/local/lib/python3.6/dist-packages/jaydebeapi/init.py", line 381, in connect jconn = _jdbc_connect(jclassname, url, driver_args, jars, libs) File "/usr/local/lib/python3.6/dist-packages/jaydebeapi/init.py", line 190, in _jdbc_connect_jpype jpype.JClass(jclassname) File "/usr/local/lib/python3.6/dist-packages/jpype/_jclass.py", line 73, in JClass raise _RUNTIMEEXCEPTION.PYEXC("Class %s not found" % name) jpype._jexception.RuntimeExceptionPyRaisable: java.lang.RuntimeException: Class de.tu_berlin.cqp.driver.CqpDriver not found

So, even though I specify the path of the jar file where my driver class exists, I get the java.lang.RuntimeException: Class not found exception. This happens if I run from jupyter-notebook as well.

Any ideas?

spapadias avatar Aug 12 '19 12:08 spapadias

@sushmit86 , are you using an ipython notebook? If so, what I found was that after you install the driver in the filesystem, you have to restart your kernel before the notebook can see the driver file and load it properly. Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.

What do you mean by "install the driver in the filesystem"? It's a jar file..

diman82 avatar Oct 14 '20 09:10 diman82

@sushmit86 , are you using an ipython notebook? If so, what I found was that after you install the driver in the filesystem, you have to restart your kernel before the notebook can see the driver file and load it properly. Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.

What do you mean by "install the driver in the filesystem"? It's a jar file..

Correct, the JAR file. I just mean copy it into the filesystem into which your kernel was started from, then restart the kernel to ensure the ipython notebook can "see" the file in the filesystem before trying to reference it in the connect call.

To clarify, the notebook service I was using at the time was based on containers, so the filesystem changes were non-permanent, and I had to copy in any extra libraries I needed as part of the execution of the notebook. If your notebook runs in an environment where you can make persistent changes to the underlying filesystem, then you only need to copy it in once, and any time the kernel starts after that, you will be able to use the library.

chris-ratcliffe avatar Oct 14 '20 13:10 chris-ratcliffe

@chris-ratcliffe Can you please elaborate on where to copy the file? Or just provide an example of your own implemnataion? I just hold it in the same path where my notebook resides, and try to load it, as suggested by @QianyuCheng & @gafortiby

diman82 avatar Oct 14 '20 13:10 diman82

@diman82 , it doesn't matter where in your filesystem you copy it to. In my case, I put it on under $HOME/drivers I believe. The key is, after its put there, the kernel needs to be restarted (assuming it was already running) before a line like:

mpwconn = jaydebeapi.connect(jclassname=driverName, url=connectionString, driver_args={'user': uid, 'password': pwd},jars=driverFileName)

that specifies the path to driver file (driverFileName) will work.

chris-ratcliffe avatar Oct 14 '20 15:10 chris-ratcliffe

Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.

took me a while to process what this meant, but this is what solved the issue for me (second call to connect was ignoring jars parameter, all jars need to be in first call to connect)

This is, in my experience, almost always the issue: jpype (correctly) reuses the JVM process for subsequent connect requests — since Jupyter kernels are long-lived processes, the first call to connect will start the JVM and leave it running until the kernel is restarted. Once that happens the drivers are "baked in" and subsequent connect(jars=...) won't be used.

We had to modify our internal code to pass all the drivers that we could potentially use, the first time we call out to connect

joebeeson avatar Oct 14 '20 15:10 joebeeson

@diman82 , it doesn't matter where in your filesystem you copy it to. In my case, I put it on under $HOME/drivers I believe. The key is, after its put there, the kernel needs to be restarted (assuming it was already running) before a line like:

mpwconn = jaydebeapi.connect(jclassname=driverName, url=connectionString, driver_args={'user': uid, 'password': pwd},jars=driverFileName)

that specifies the path to driver file (driverFileName) will work.

@chris-ratcliffe What kernel are you refering to? Jupyter? I've restarted and closed pycharm numerous times - all in vain.

diman82 avatar Oct 14 '20 18:10 diman82

@joebeeson Attaching code fragment (Jupyter cell) and its output:

`#%%

import jpype jvmPath = jpype.getDefaultJVMPath() print(jpype.getDefaultJVMPath()) print(jpype.isJVMStarted()) print(jdbc_driver_loc) args='-Djava.class.path=%s' % jdbc_driver_loc jpype.addClassPath(jdbc_driver_loc)

if jpype.isJVMStarted():

jpype.startJVM(jvmPath, args)

If this section worked, then the below will print a nice message.

if the below errors, then something is broken and no matter what you try the connection will NOT work.

print(jpype.JClass('org.netezza.Driver'))`

Output:

` LoaJVM driver

import jpype jvmPath = jpype.getDefaultJVMPath() print(jpype.getDefaultJVMPath()) print(jpype.isJVMStarted()) print(jdbc_driver_loc) args='-Djava.class.path=%s' % jdbc_driver_loc jpype.addClassPath(jdbc_driver_loc)

if jpype.isJVMStarted():

jpype.startJVM(jvmPath, args)

If this section worked, then the below will print a nice message.

if the below errors, then something is broken and no matter what you try the connection will NOT work.

print(jpype.JClass('org.netezza.Driver')) C:\Program Files\Java\jre1.8.0_231\bin\server\jvm.dll True C:\dev\my-test-project\netezza-3.5.0.jar

TypeError Traceback (most recent call last)

in () 12 # If this section worked, then the below will print a nice message. 13 # if the below errors, then something is broken and no matter what you try the connection will NOT work. ---> 14 print(jpype.JClass('org.netezza.Driver')) 15

C:\Users\dm57337.conda\envs\py36\lib\site-packages\jpype_jclass.py in new(cls, jc, loader, initialize) 97 98 # Pass to class factory to create the type ---> 99 return _jpype._getClass(jc) 100 101

TypeError: Class org.netezza.Driver is not found`

diman82 avatar Oct 14 '20 18:10 diman82

Also do not underestimate the power of simply manually loading the driver in JPype.

import jpype
jpype.startJVM(classpath=[list of jar files to be loaded])
print(jpype.java.lang.System.out.println("java.class.path"))
print(jpype.JClass("driver.class.Name"))

If something you are trying to load does not appear in the print statement then it is not loaded (bad jar version, bad path relative to were the notebook started, etc). If the jar does appear in the class path but does not load when manually invoked, then there is likely a dependency missing or some other problem. The exact problem should appear in the exception message. (wrong Jar version, missing dependency, bad format, failure in initialization, etc)

Thrameos avatar Oct 14 '20 18:10 Thrameos

Eventually it was my silly bad - I download incorrect/misleading jar from here: http://www.java2s.com/Code/Jar/n/Downloadnetezza350jar.htm The correct jar is this: https://mvnrepository.com/artifact/org.netezza/nzjdbc/1.0

diman82 avatar Oct 14 '20 20:10 diman82

Hi @joebeeson @himynameschris @spapadias @sushmit86 @chris-ratcliffe

from pyspark.sql import SparkSession import findspark import jaydebeapi findspark.add_packages('mysql:mysql-connector-java:8.0.22') spark = SparkSession.builder.appName('local').getOrCreate() connection_String = 'jdbc:mysql://localhost:3306/employee?allowPublicKeyRetrieval=true&useSSL=false' jdbc_driver_name='com.mysql.jdbc.Driver' jarFile='C:\spark-2.3.4\jars\mysql-connector-java:8.0.22.jar' username='admin' password='admin123' conn = jaydebeapi.connect(jclassname=jdbc_driver_name, url=connection_String, driver_args={'user': username, 'password': password},jars=jarFile)

my code is this but i am getting an error like TypeError: Class com.mysql.jdbc.Driver is not found

i was so confused can anyone help me to resolve this

Thanks and Regards, Manikantha Sehar.

mady143 avatar Dec 04 '20 07:12 mady143

Hi @joebeeson @himynameschris @spapadias @sushmit86 @chris-ratcliffe

from pyspark.sql import SparkSession import findspark import jaydebeapi findspark.add_packages('mysql:mysql-connector-java:8.0.22') spark = SparkSession.builder.appName('local').getOrCreate() connection_String = 'jdbc:mysql://localhost:3306/employee?allowPublicKeyRetrieval=true&useSSL=false' jdbc_driver_name='com.mysql.jdbc.Driver' jarFile='C:\spark-2.3.4\jars\mysql-connector-java:8.0.22.jar' username='admin' password='admin123' conn = jaydebeapi.connect(jclassname=jdbc_driver_name, url=connection_String, driver_args={'user': username, 'password': password},jars=jarFile)

my code is this but i am getting an error like TypeError: Class com.mysql.jdbc.Driver is not found

i was so confused can anyone help me to resolve this

Thanks and Regards, Manikantha Sehar.

This error can yield from several sources:

  1. First thing first, check if driver_name is actually in jar file, using some java decompiler tool, I would recommend this one: https://github.com/java-decompiler/jd-gui/releases You may be loading an incorrect jar or referencing an incorrect class. I'm positive this is the cause of the error
  2. If didn't help, try loading jar with Jpype (there are examples of using Jpype in this thread, for complete information please check official documentation)

diman82 avatar Dec 04 '20 08:12 diman82

@chris-ratcliffe hey I facing exactly similar issue. Tried every bit. Can you help please.

nehaverma1142 avatar Jan 21 '21 20:01 nehaverma1142

For me the problem turned out to be that I was using a Linux agent while the code was built for Windows and that's why it couldn't find the driver. Switching to a Windows agent fixed it.

colinvanlieshout avatar Nov 11 '21 12:11 colinvanlieshout