pypmml-spark
TypeError: 'JavaPackage' object is not callable error despite linking jars into Spark successfully
I have run the link_pmml4s_jars_into_spark.py script successfully

and the pmml4s jar files are present in the SPARK_HOME location.

However, the TypeError: 'JavaPackage' object is not callable error still occurs.

I am running Java Version=1.8.0_302 and Spark Version=3.2.1.
I would appreciate any suggestions about what might be missing.
@NatMzk ScoreModel.fromFile() expects a local pathname for the model. Could you use one of the other methods, such as fromBytes or fromString, to load it? In that case you first need to read the model from the dbfs:/... path yourself.
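For illustration, a minimal sketch of that approach, assuming the model file is reachable through the local /dbfs mount that Databricks exposes for dbfs:/ paths (the path below is a hypothetical placeholder, not the reporter's actual location):

```python
# Minimal sketch: read the PMML document yourself, then pass its contents to
# ScoreModel.fromString instead of ScoreModel.fromFile.
# "/dbfs/models/model.pmml" is a hypothetical path standing in for the real dbfs:/ location.
from pypmml_spark import ScoreModel

with open("/dbfs/models/model.pmml", "r") as f:
    pmml_text = f.read()

model = ScoreModel.fromString(pmml_text)
```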
From my understanding, the dbfs path is Databricks's local path where the XML PMML model is located. I tried the fromBytes and fromString methods, but they result in the same error.
@NatMzk Could you provide the full stack trace of the exception above? Also, try restarting the kernel before loading the model.
I restarted the kernel by detaching and reattaching the notebook, with no results. The error trace is as follows:

I am running Databricks Runtime Version 10.4 LTS on a single-node cluster (not pure Spark), with Apache Spark=3.2.1 and Java Version=1.8.0_302 (Azul Systems, Inc.).
I don't have the Databricks Runtime, but when I remove the links created by the link_pmml4s_jars_into_spark.py script, I can reproduce the same error on my side. So I suspect your issue has the same cause: the dependent jars of pmml4s are not found by Spark. There are several ways to try:
For details about the following configurations, see the official doc: https://spark.apache.org/docs/latest/configuration.html
All of these can be specified in the conf file or on the command line; check the doc for your environment. Taking the command line that launches pyspark as an example (a programmatic equivalent is sketched after the list):
1. Set spark.jars:
   pyspark --conf spark.jars="$(echo /Path/To/pypmml_spark/jars/*.jar | tr ' ' ',')"
2. Set spark.jars.packages:
   pyspark --conf spark.jars.packages=org.pmml4s:pmml4s_2.12:0.9.16,org.pmml4s:pmml4s-spark_2.12:0.9.16,io.spray:spray-json_2.12:1.3.5,org.apache.commons:commons-math3:3.6.1
3. Set spark.driver.extraClassPath and spark.executor.extraClassPath:
   pyspark --conf spark.driver.extraClassPath="/Path/To/pypmml_spark/jars/*" --conf spark.executor.extraClassPath="/Path/To/pypmml_spark/jars/*"
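For completeness, a minimal sketch of the programmatic equivalent in plain PySpark, assuming you create the SparkSession yourself (on Databricks the session already exists, so set the same keys in the cluster's Spark config instead). The jar path is the same placeholder used above:

```python
# Minimal sketch, assuming a plain PySpark environment where you create the
# SparkSession yourself; the config must be applied before the session starts.
import glob
from pyspark.sql import SparkSession

# "/Path/To/pypmml_spark/jars" is the placeholder path from the options above.
jars = ",".join(glob.glob("/Path/To/pypmml_spark/jars/*.jar"))

spark = (
    SparkSession.builder
    .appName("pmml4s-scoring")
    .config("spark.jars", jars)  # option 1: local jar files
    # Option 2: pull the dependencies from Maven instead of local files:
    # .config("spark.jars.packages",
    #         "org.pmml4s:pmml4s_2.12:0.9.16,org.pmml4s:pmml4s-spark_2.12:0.9.16,"
    #         "io.spray:spray-json_2.12:1.3.5,org.apache.commons:commons-math3:3.6.1")
    .getOrCreate()
)
```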
I recommend options 1 and 2.
@NatMzk Did the methods above resolve your issue?
Another relatively simple way for Databricks is to copy the jar files to /databricks/jars, for example in a cluster install script.
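As an illustration only, a minimal Python sketch of that copy step, assuming the installed pypmml_spark package bundles its jars in a "jars" subdirectory (as the /Path/To/pypmml_spark/jars path above suggests). In practice the equivalent would run from a cluster init script so the jars are already on the classpath when Spark starts:

```python
# Minimal sketch: copy the bundled pmml4s jars into /databricks/jars.
# Assumes the installed pypmml_spark package contains a "jars" directory;
# run this at cluster start (e.g. from an init script) so the jars are
# picked up when Spark launches.
import glob
import os
import shutil

import pypmml_spark

jars_dir = os.path.join(os.path.dirname(pypmml_spark.__file__), "jars")
for jar in glob.glob(os.path.join(jars_dir, "*.jar")):
    shutil.copy(jar, "/databricks/jars/")
```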