jaydebeapi
jaydebeapi copied to clipboard
Explicitly specifying a jdbc driver using the jars parameter is not working in a Jupyter notebook context
(originally I asked this in issue #22, but then realized it was closed, so I am opening a new issue)
@baztian , I am also seeing a failure running inside a Jupyter notebook vs a standalone python script where it works fine. I can't seem to figure out the difference. Here is the code run in the notebook:
# create JDBC connection to the netezza database
import jaydebeapi
!echo 'JAVA_HOME: '$JAVA_HOME
print("Driver name: " + driverName + "\nDriver path: " + driverFileName)
!ls -altr $driverFileName
!ls -daltr $driverDir
print("jaydebeapi version: " + jaydebeapi.__version__)
mpwconn = jaydebeapi.connect(jclassname=driverName, url=connectionString, driver_args={'user': uid, 'password': pwd},jars=driverFileName)
JAVA_HOME: /opt/ibm/java-x86_64-80
Driver name: org.netezza.Driver
Driver path: /home/dsxuser/drivers/nzjdbc-3.jar
-rwxr-xr-x 1 dsxuser dsxuser 359254 Oct 16 16:55 /home/dsxuser/drivers/nzjdbc-3.jar
drwxr-xr-x 2 dsxuser dsxuser 4096 Oct 16 15:09 /home/dsxuser/drivers
jaydebeapi version: 1.1.1
---------------------------------------------------------------------------
java.lang.RuntimeExceptionPyRaisable Traceback (most recent call last)
<ipython-input-37-03b494b3284a> in <module>()
11 #!md5sum $driverFileName
12
---> 13 mpwconn = jaydebeapi.connect(jclassname=driverName, url=connectionString, driver_args={'user': uid, 'password': pwd},jars=driverFileName)
14
15 curs = mpwconn.cursor()
/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/jaydebeapi/__init__.py in connect(jclassname, url, driver_args, jars, libs)
379 else:
380 libs = []
--> 381 jconn = _jdbc_connect(jclassname, url, driver_args, jars, libs)
382 return Connection(jconn, _converters)
383
/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/jaydebeapi/__init__.py in _jdbc_connect_jpype(jclassname, url, driver_args, jars, libs)
188 return jpype.JArray(jpype.JByte, 1)(data)
189 # register driver for DriverManager
--> 190 jpype.JClass(jclassname)
191 if isinstance(driver_args, dict):
192 Properties = jpype.java.util.Properties
/opt/conda/envs/DSX-Python35/lib/python3.5/site-packages/jpype/_jclass.py in JClass(name)
53 jc = _jpype.findClass(name)
54 if jc is None:
---> 55 raise _RUNTIMEEXCEPTION.PYEXC("Class %s not found" % name)
56
57 return _getClassFor(jc)
java.lang.RuntimeExceptionPyRaisable: java.lang.RuntimeException: Class org.netezza.Driver not found
curs = mpwconn.cursor()
Is there any other information that I can provide to shed some light on why it can't find the class?
Originally posted by @chris-ratcliffe in https://github.com/baztian/jaydebeapi/issues/22#issuecomment-430323797
I came across the same issue and tried several things, after running these lines of codes, connection seems to be working for me:
import jaydebeapi as jdbc
import jpype
import os
jar = Driver_location_path
os.environ['JAVA_HOME'] =Java_path
os.environ['CLASSPATH'] =jar
args='-Djava.class.path=%s' % jar
jvm_path = jpype.getDefaultJVMPath()
jpype.startJVM(jvm_path, args)
Hope it solve the problem!
@chris-ratcliffe were you able to resolve this issue? I have been trying to connect Netezza in a python3 and mac but always getting class not found issue.
@sushmit86 , are you using an ipython notebook? If so, what I found was that after you install the driver in the filesystem, you have to restart your kernel before the notebook can see the driver file and load it properly. Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.
@chris-ratcliffe I tried connecting using a python shell terminal as well. But I still get the same error. Also, I am not pulling from multiple databases
Sorry, @sushmit86 , the two tricks I mentioned above got my issue resolved, so I am not sure what else might be the problem.
Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.
took me a while to process what this meant, but this is what solved the issue for me (second call to connect was ignoring jars parameter, all jars need to be in first call to connect)
I also had this problem, and in my case I found that the way I was referencing the JAR file in my Windows was not working properly. So this below works fine for me.
During debugging I figured out that jaydebeapi imports jpype and starts the jvm on first run. All Jars must be included at jvm startup, otherwise you have to shutdown jvm and restart it with the jars you want. Anyway this is why the JAR has to defined in the FIRST jaydebeapi.connect function call, otherwise it cannot be used even if included in subsequent calls.
In case of Jupyter notebook, this means you have to restart the kernel to get the proper behavior IF you did not execute your code in the right sequence.
import jaydebeapi as jdbc
jdbc_driver_name = "org.netezza.Driver"
conn = jdbc.connect(jdbc_driver_name
,"jdbc:netezza://server/database"
,driver_args={'user': sql_login, 'password': sql_password}
,jars = "C:\\JDBC\\nzjdbc.jar")
I tried this code below to debug, but in the end I discovered that I didn't have to run this section at all to get the code to work. This forces the jvm to start with your jar, and I guess it would work in odd cases where the very first jaydebeapi.connect function call includes all the jars you want and the jvm still doesn't start with the right jars.
# Depending on your environment and other anomalies, jvm may start without loading the required JAR.
# In that case uncomment and run the following code.
import jpype
driver_jar_path = 'C:\\JDBC\\nzjdbc.jar'
jvmPath = jpype.getDefaultJVMPath()
args='-Djava.class.path=%s' % driver_jar_path
jpype.startJVM(jvmPath, args)
# If this section worked, then the below will print a nice message.
# if the below errors, then something is broken and no matter what you try the connection will NOT work.
print(jpype.JClass('org.netezza.Driver'))
I also have a similar problem. So, I want to connect to my driver for apache calcite with the following code:
`import jaydebeapi as jay
conn = jay.connect("cqp.driver.CqpDriver", "jdbc:cqp:", {'user': "admin", 'password': "admin", 'model': "/cqp/src/main/resources/model-three.json" }, "/cqp/target/cqp-1.0-SNAPSHOT.jar")`
And I get the following trace:
Traceback (most recent call last):
File "makis.py", line 149, in
So, even though I specify the path of the jar file where my driver class exists, I get the java.lang.RuntimeException: Class not found exception. This happens if I run from jupyter-notebook as well.
Any ideas?
@sushmit86 , are you using an ipython notebook? If so, what I found was that after you install the driver in the filesystem, you have to restart your kernel before the notebook can see the driver file and load it properly. Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.
What do you mean by "install the driver in the filesystem"? It's a jar file..
@sushmit86 , are you using an ipython notebook? If so, what I found was that after you install the driver in the filesystem, you have to restart your kernel before the notebook can see the driver file and load it properly. Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.
What do you mean by "install the driver in the filesystem"? It's a jar file..
Correct, the JAR file. I just mean copy it into the filesystem into which your kernel was started from, then restart the kernel to ensure the ipython notebook can "see" the file in the filesystem before trying to reference it in the connect call.
To clarify, the notebook service I was using at the time was based on containers, so the filesystem changes were non-permanent, and I had to copy in any extra libraries I needed as part of the execution of the notebook. If your notebook runs in an environment where you can make persistent changes to the underlying filesystem, then you only need to copy it in once, and any time the kernel starts after that, you will be able to use the library.
@chris-ratcliffe Can you please elaborate on where to copy the file? Or just provide an example of your own implemnataion? I just hold it in the same path where my notebook resides, and try to load it, as suggested by @QianyuCheng & @gafortiby
@diman82 , it doesn't matter where in your filesystem you copy it to. In my case, I put it on under $HOME/drivers I believe. The key is, after its put there, the kernel needs to be restarted (assuming it was already running) before a line like:
mpwconn = jaydebeapi.connect(jclassname=driverName, url=connectionString, driver_args={'user': uid, 'password': pwd},jars=driverFileName)
that specifies the path to driver file (driverFileName) will work.
Also, if you are pulling from multiple database connections, ie. making multiple jaydebeapi.connect() calls, the jars parameter in your first connection call must contain all of the paths to your jdbc jar files. Any jar parameter in subsequent connect() calls seems to be ignored.
took me a while to process what this meant, but this is what solved the issue for me (second call to connect was ignoring jars parameter, all jars need to be in first call to connect)
This is, in my experience, almost always the issue: jpype (correctly) reuses the JVM process for subsequent connect requests — since Jupyter kernels are long-lived processes, the first call to connect will start the JVM and leave it running until the kernel is restarted. Once that happens the drivers are "baked in" and subsequent connect(jars=...) won't be used.
We had to modify our internal code to pass all the drivers that we could potentially use, the first time we call out to connect
@diman82 , it doesn't matter where in your filesystem you copy it to. In my case, I put it on under $HOME/drivers I believe. The key is, after its put there, the kernel needs to be restarted (assuming it was already running) before a line like:
mpwconn = jaydebeapi.connect(jclassname=driverName, url=connectionString, driver_args={'user': uid, 'password': pwd},jars=driverFileName)that specifies the path to driver file (driverFileName) will work.
@chris-ratcliffe What kernel are you refering to? Jupyter? I've restarted and closed pycharm numerous times - all in vain.
@joebeeson Attaching code fragment (Jupyter cell) and its output:
`#%%
import jpype jvmPath = jpype.getDefaultJVMPath() print(jpype.getDefaultJVMPath()) print(jpype.isJVMStarted()) print(jdbc_driver_loc) args='-Djava.class.path=%s' % jdbc_driver_loc jpype.addClassPath(jdbc_driver_loc)
if jpype.isJVMStarted():
jpype.startJVM(jvmPath, args)
If this section worked, then the below will print a nice message.
if the below errors, then something is broken and no matter what you try the connection will NOT work.
print(jpype.JClass('org.netezza.Driver'))`
Output:
` LoaJVM driver
import jpype jvmPath = jpype.getDefaultJVMPath() print(jpype.getDefaultJVMPath()) print(jpype.isJVMStarted()) print(jdbc_driver_loc) args='-Djava.class.path=%s' % jdbc_driver_loc jpype.addClassPath(jdbc_driver_loc)
if jpype.isJVMStarted():
jpype.startJVM(jvmPath, args)
If this section worked, then the below will print a nice message.
if the below errors, then something is broken and no matter what you try the connection will NOT work.
print(jpype.JClass('org.netezza.Driver')) C:\Program Files\Java\jre1.8.0_231\bin\server\jvm.dll True C:\dev\my-test-project\netezza-3.5.0.jar
TypeError Traceback (most recent call last)
C:\Users\dm57337.conda\envs\py36\lib\site-packages\jpype_jclass.py in new(cls, jc, loader, initialize) 97 98 # Pass to class factory to create the type ---> 99 return _jpype._getClass(jc) 100 101
TypeError: Class org.netezza.Driver is not found`
Also do not underestimate the power of simply manually loading the driver in JPype.
import jpype
jpype.startJVM(classpath=[list of jar files to be loaded])
print(jpype.java.lang.System.out.println("java.class.path"))
print(jpype.JClass("driver.class.Name"))
If something you are trying to load does not appear in the print statement then it is not loaded (bad jar version, bad path relative to were the notebook started, etc). If the jar does appear in the class path but does not load when manually invoked, then there is likely a dependency missing or some other problem. The exact problem should appear in the exception message. (wrong Jar version, missing dependency, bad format, failure in initialization, etc)
Eventually it was my silly bad - I download incorrect/misleading jar from here: http://www.java2s.com/Code/Jar/n/Downloadnetezza350jar.htm The correct jar is this: https://mvnrepository.com/artifact/org.netezza/nzjdbc/1.0
Hi @joebeeson @himynameschris @spapadias @sushmit86 @chris-ratcliffe
from pyspark.sql import SparkSession import findspark import jaydebeapi findspark.add_packages('mysql:mysql-connector-java:8.0.22') spark = SparkSession.builder.appName('local').getOrCreate() connection_String = 'jdbc:mysql://localhost:3306/employee?allowPublicKeyRetrieval=true&useSSL=false' jdbc_driver_name='com.mysql.jdbc.Driver' jarFile='C:\spark-2.3.4\jars\mysql-connector-java:8.0.22.jar' username='admin' password='admin123' conn = jaydebeapi.connect(jclassname=jdbc_driver_name, url=connection_String, driver_args={'user': username, 'password': password},jars=jarFile)
my code is this but i am getting an error like TypeError: Class com.mysql.jdbc.Driver is not found
i was so confused can anyone help me to resolve this
Thanks and Regards, Manikantha Sehar.
Hi @joebeeson @himynameschris @spapadias @sushmit86 @chris-ratcliffe
from pyspark.sql import SparkSession import findspark import jaydebeapi findspark.add_packages('mysql:mysql-connector-java:8.0.22') spark = SparkSession.builder.appName('local').getOrCreate() connection_String = 'jdbc:mysql://localhost:3306/employee?allowPublicKeyRetrieval=true&useSSL=false' jdbc_driver_name='com.mysql.jdbc.Driver' jarFile='C:\spark-2.3.4\jars\mysql-connector-java:8.0.22.jar' username='admin' password='admin123' conn = jaydebeapi.connect(jclassname=jdbc_driver_name, url=connection_String, driver_args={'user': username, 'password': password},jars=jarFile)
my code is this but i am getting an error like TypeError: Class com.mysql.jdbc.Driver is not found
i was so confused can anyone help me to resolve this
Thanks and Regards, Manikantha Sehar.
This error can yield from several sources:
- First thing first, check if driver_name is actually in jar file, using some java decompiler tool, I would recommend this one: https://github.com/java-decompiler/jd-gui/releases You may be loading an incorrect jar or referencing an incorrect class. I'm positive this is the cause of the error
- If didn't help, try loading jar with Jpype (there are examples of using Jpype in this thread, for complete information please check official documentation)
@chris-ratcliffe hey I facing exactly similar issue. Tried every bit. Can you help please.
For me the problem turned out to be that I was using a Linux agent while the code was built for Windows and that's why it couldn't find the driver. Switching to a Windows agent fixed it.