[BUG] Python package built from source fails to run

Open luyanaa opened this issue 9 months ago • 0 comments

SynapseML version

'0.0.0-1659-1b5df703-SNAPSHOT'

System information

  • Language version (e.g. python 3.8, scala 2.12): Python 3.11, Scala version 2.12.17
  • Spark Version (e.g. 3.2.3): 3.4.1
  • Spark Platform (e.g. Synapse, Databricks): PySpark

Describe the problem

I built SynapseML from source by following https://microsoft.github.io/SynapseML/docs/Reference/Developer%20Setup/. Running the resulting Python package fails with "'JavaPackage' object is not callable". If I manually copy the jars generated in target/scala-2.12 to site-packages/pyspark/jars, the error changes to "py4j.protocol.Py4JNetworkError: Answer from Java side is empty".
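As an alternative to copying jars into site-packages/pyspark/jars, the jars can be passed to the session through Spark's `spark.jars` setting. The sketch below is only an illustration of that idea: `collect_jars` and `build_session` are hypothetical helper names, and whether the directory passed in contains every jar SynapseML needs (including its dependencies) is an assumption that this issue does not confirm.

```python
from pathlib import Path


def collect_jars(jar_dir):
    """Return a comma-separated string of absolute paths to every .jar in jar_dir."""
    jars = sorted(Path(jar_dir).glob("*.jar"))
    return ",".join(str(p.resolve()) for p in jars)


def build_session(jar_dir):
    """Start a local SparkSession with the collected jars on the classpath.

    Hypothetical helper: jar_dir is assumed to hold the SynapseML jars
    AND the jars of their dependencies.
    """
    # Imported lazily so collect_jars can be used without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[*]")
        # spark.jars takes a comma-separated list of jars for the
        # driver and executor classpaths.
        .config("spark.jars", collect_jars(jar_dir))
        .getOrCreate()
    )
```

Calling `build_session("target/scala-2.12")` before importing `synapse.ml` would be one way to try this. The session must be created before any other SparkSession exists in the process, because `getOrCreate()` otherwise returns the existing session and the classpath is not changed.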

Code to reproduce issue

from synapse.ml.lightgbm import *
LightGBMRegressor()

Other info / logs

Exception in thread "Thread-26" java.lang.NoClassDefFoundError: spray/json/JsonFormat
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:398)
        at py4j.reflection.CurrentThreadClassLoadingStrategy.classForName(CurrentThreadClassLoadingStrategy.java:40)
        at py4j.reflection.ReflectionUtil.classForName(ReflectionUtil.java:51)
        at py4j.reflection.TypeUtil.forName(TypeUtil.java:243)
        at py4j.commands.ReflectionCommand.getUnknownMember(ReflectionCommand.java:175)
        at py4j.commands.ReflectionCommand.execute(ReflectionCommand.java:87)
        at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
        at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.ClassNotFoundException: spray.json.JsonFormat
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:527)
        ... 10 more
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/home/luyanaa/miniforge3/envs/synapseml/lib/python3.11/site-packages/pyspark/python/lib/py4j-0.10.9.7-src.zip/py4j/clientserver.py", line 516, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/luyanaa/miniforge3/envs/synapseml/lib/python3.11/site-packages/pyspark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1038, in send_command
    response = connection.send_command(command)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luyanaa/miniforge3/envs/synapseml/lib/python3.11/site-packages/pyspark/python/lib/py4j-0.10.9.7-src.zip/py4j/clientserver.py", line 539, in send_command
    raise Py4JNetworkError(
py4j.protocol.Py4JNetworkError: Error while sending or receiving
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/luyanaa/miniforge3/envs/synapseml/lib/python3.11/site-packages/pyspark/__init__.py", line 139, in wrapper
    return func(self, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/luyanaa/miniforge3/envs/synapseml/lib/python3.11/site-packages/synapse/ml/featurize/text/TextFeaturizer.py", line 106, in __init__
    self._java_obj = self._new_java_obj("com.microsoft.azure.synapse.ml.featurize.text.TextFeaturizer", self.uid)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luyanaa/miniforge3/envs/synapseml/lib/python3.11/site-packages/pyspark/ml/wrapper.py", line 84, in _new_java_obj
    java_obj = getattr(java_obj, name)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luyanaa/miniforge3/envs/synapseml/lib/python3.11/site-packages/pyspark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1664, in __getattr__
py4j.protocol.Py4JError: com.microsoft.azure.synapse.ml.featurize.text.TextFeaturizer does not exist in the JVM
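The `NoClassDefFoundError: spray/json/JsonFormat` at the top of the log suggests that the jars on the classpath do not include the one that provides `spray.json.JsonFormat`. A minimal sketch for checking this, using only the Python standard library, is shown here. `jars_providing` is a hypothetical helper name, and the directory to scan is whatever jar directory the session actually uses.

```python
import zipfile
from pathlib import Path


def jars_providing(class_name, jar_dir):
    """Return the names of jars in jar_dir that contain the given class.

    class_name may use dots or slashes, e.g. "spray.json.JsonFormat"
    or "spray/json/JsonFormat".
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid archives.
            continue
    return hits
```

An empty list from `jars_providing("spray.json.JsonFormat", jar_dir)` would mean that no jar in that directory ships the class, which would be consistent with the `ClassNotFoundException` above.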

What component(s) does this bug affect?

  • [ ] area/cognitive: Cognitive project
  • [x] area/core: Core project
  • [ ] area/deep-learning: DeepLearning project
  • [x] area/lightgbm: Lightgbm project
  • [ ] area/opencv: Opencv project
  • [ ] area/vw: VW project
  • [ ] area/website: Website
  • [ ] area/build: Project build system
  • [ ] area/notebooks: Samples under notebooks folder
  • [ ] area/docker: Docker usage
  • [ ] area/models: models related issue

What language(s) does this bug affect?

  • [ ] language/scala: Scala source code
  • [x] language/python: Pyspark APIs
  • [ ] language/r: R APIs
  • [ ] language/csharp: .NET APIs
  • [ ] language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • [ ] integrations/synapse: Azure Synapse integrations
  • [ ] integrations/azureml: Azure ML integrations
  • [ ] integrations/databricks: Databricks integrations

luyanaa · Feb 21 '25 14:02