Error installing synapseml

Open shazriz opened this issue 2 years ago • 6 comments

SynapseML version

Installation issue

System information

  • Language version (python 3.7, scala 2.12.10):
  • Spark Version ( 3.1.2):
  • Spark Platform (pyspark): Currently no cloud integration.

Describe the problem

:: loading settings :: url = jar:file:/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/site-packages/pyspark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: /home/p105327_t1/.ivy2/cache
The jars for the packages stored in: /home/p105327_t1/.ivy2/jars
com.microsoft.azure#synapseml_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-0a41b358-7270-4203-b8f7-6917e3a1de6c;1.0
    confs: [default]
:: resolution report :: resolve 509261ms :: artifacts dl 1ms
    :: modules in use:
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
    ---------------------------------------------------------------------

:: problems summary ::
:::: WARNINGS
        module not found: com.microsoft.azure#synapseml_2.12;0.10.2
    ==== local-m2-cache: tried
      file:/home/p105327_t1/.m2/repository/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.10.2!synapseml_2.12.jar:
      file:/home/p105327_t1/.m2/repository/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar
    ==== local-ivy-cache: tried
      /home/p105327_t1/.ivy2/local/com.microsoft.azure/synapseml_2.12/0.10.2/ivys/ivy.xml
      -- artifact com.microsoft.azure#synapseml_2.12;0.10.2!synapseml_2.12.jar:
      /home/p105327_t1/.ivy2/local/com.microsoft.azure/synapseml_2.12/0.10.2/jars/synapseml_2.12.jar
    ==== central: tried
      https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.10.2!synapseml_2.12.jar:
      https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar
    ==== spark-packages: tried
      https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.10.2!synapseml_2.12.jar:
      https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar
        ::::::::::::::::::::::::::::::::::::::::::::::
        ::          UNRESOLVED DEPENDENCIES         ::
        ::::::::::::::::::::::::::::::::::::::::::::::
        :: com.microsoft.azure#synapseml_2.12;0.10.2: not found
        ::::::::::::::::::::::::::::::::::::::::::::::

:::: ERRORS
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.pom (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.10.2/synapseml_2.12-0.10.2.jar (java.net.ConnectException: Connection timed out (Connection timed out))

:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.microsoft.azure#synapseml_2.12;0.10.2: not found]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1429)
    at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
    at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:308)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Traceback (most recent call last):
  File "/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "", line 2, in
  File "/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/lib/python3.7/site-packages/pyspark/context.py", line 384, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/lib/python3.7/site-packages/pyspark/context.py", line 144, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/lib/python3.7/site-packages/pyspark/context.py", line 331, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/lib/python3.7/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number

Code to reproduce issue

import pyspark
spark = pyspark.sql.SparkSession.builder.appName("MyApp").config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.10.2").getOrCreate()

Other info / logs

No response

What component(s) does this bug affect?

  • [ ] area/cognitive: Cognitive project
  • [X] area/core: Core project
  • [ ] area/deep-learning: DeepLearning project
  • [ ] area/lightgbm: Lightgbm project
  • [ ] area/opencv: Opencv project
  • [ ] area/vw: VW project
  • [ ] area/website: Website
  • [ ] area/build: Project build system
  • [ ] area/notebooks: Samples under notebooks folder
  • [ ] area/docker: Docker usage
  • [ ] area/models: models related issue

What language(s) does this bug affect?

  • [ ] language/scala: Scala source code
  • [X] language/python: Pyspark APIs
  • [ ] language/r: R APIs
  • [ ] language/csharp: .NET APIs
  • [ ] language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • [ ] integrations/synapse: Azure Synapse integrations
  • [ ] integrations/azureml: Azure ML integrations
  • [ ] integrations/databricks: Databricks integrations

shazriz avatar Feb 03 '23 12:02 shazriz

Hey @shazriz :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.

github-actions[bot] avatar Feb 03 '23 12:02 github-actions[bot]

Hi @shazriz, if you're using Spark 3.1.2, please use version 0.9.5-13-d1b51517-SNAPSHOT. The installation guidance is here: https://microsoft.github.io/SynapseML/docs/getting_started/installation/
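
The version pairings mentioned in this thread can be collected into a small lookup. This is only a sketch: the helper name `synapseml_coordinate` and the table `SYNAPSEML_VERSION_BY_SPARK` are hypothetical, and the table holds just the pairings quoted in this thread, so the installation guide remains the place to confirm current values.

```python
# Pairings quoted in this thread only; illustrative, not an exhaustive or official table.
SYNAPSEML_VERSION_BY_SPARK = {
    "3.1": "0.9.5-13-d1b51517-SNAPSHOT",
    "3.2": "0.11.2",
    "3.3": "0.11.2-spark3.3",
}


def synapseml_coordinate(spark_version: str) -> str:
    """Return a Maven coordinate for the given Spark version (hypothetical helper)."""
    # Reduce "3.1.2" to "3.1" so patch releases share one entry.
    major_minor = ".".join(spark_version.strip().split(".")[:2])
    try:
        version = SYNAPSEML_VERSION_BY_SPARK[major_minor]
    except KeyError:
        raise ValueError(f"No pairing recorded for Spark {spark_version}") from None
    return f"com.microsoft.azure:synapseml_2.12:{version}"
```

For example, `synapseml_coordinate("3.1.2")` yields the string to pass as `spark.jars.packages` for the setup described in the original report.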

serena-ruan avatar Feb 06 '23 08:02 serena-ruan

Hey @serena-ruan I have also tried the version you mentioned above and I get a similar error.

:: loading settings :: url = jar:file:/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/site-packages/pyspark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: /home/p105327_t1/.ivy2/cache
The jars for the packages stored in: /home/p105327_t1/.ivy2/jars
com.microsoft.azure#synapseml_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-31d7ecd5-add7-408e-b30c-7c20846a1a6e;1.0
    confs: [default]
:: resolution report :: resolve 763889ms :: artifacts dl 0ms
    :: modules in use:
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
    ---------------------------------------------------------------------

:: problems summary ::
:::: WARNINGS
        module not found: com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT
    ==== local-m2-cache: tried
      file:/home/p105327_t1/.m2/repository/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT!synapseml_2.12.jar:
      file:/home/p105327_t1/.m2/repository/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar
    ==== local-ivy-cache: tried
      /home/p105327_t1/.ivy2/local/com.microsoft.azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/ivys/ivy.xml
      -- artifact com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT!synapseml_2.12.jar:
      /home/p105327_t1/.ivy2/local/com.microsoft.azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/jars/synapseml_2.12.jar
    ==== central: tried
      https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT!synapseml_2.12.jar:
      https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar
    ==== spark-packages: tried
      https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom
      -- artifact com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT!synapseml_2.12.jar:
      https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar
        ::::::::::::::::::::::::::::::::::::::::::::::
        ::          UNRESOLVED DEPENDENCIES         ::
        ::::::::::::::::::::::::::::::::::::::::::::::
        :: com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT: not found
        ::::::::::::::::::::::::::::::::::::::::::::::

:::: ERRORS
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/maven-metadata.xml (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repo1.maven.org/maven2/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/maven-metadata.xml (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.pom (java.net.ConnectException: Connection timed out (Connection timed out))
    Server access error at url https://repos.spark-packages.org/com/microsoft/azure/synapseml_2.12/0.9.5-13-d1b51517-SNAPSHOT/synapseml_2.12-0.9.5-13-d1b51517-SNAPSHOT.jar (java.net.ConnectException: Connection timed out (Connection timed out))

:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.microsoft.azure#synapseml_2.12;0.9.5-13-d1b51517-SNAPSHOT: not found]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1429)
    at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
    at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:308)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Traceback (most recent call last):
  File "/lib/python3.7/code.py", line 90, in runcode
    exec(code, self.locals)
  File "", line 2, in
  File "/lib/python3.7/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/lib/python3.7/site-packages/pyspark/context.py", line 384, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/lib/python3.7/site-packages/pyspark/context.py", line 144, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/app/ide/anaconda/envs/telematics_env_202211/lib/python3.7/site-packages/pyspark/context.py", line 331, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/lib/python3.7/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number

shazriz avatar Feb 06 '23 13:02 shazriz

Hi @shazriz, did you add our custom Maven resolver? "spark.jars.repositories": "https://mmlspark.azureedge.net/maven". Also, from your logs it looks like an issue with your network: (java.net.ConnectException: Connection timed out (Connection timed out))
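
Both suggestions in this comment can be sketched in a few lines of Python: passing the resolver alongside the package, and checking whether the repository hosts are reachable at all. The helper names (`spark_package_conf`, `can_connect`, `build_session`) are hypothetical and not part of SynapseML or PySpark; the only Spark settings used are `spark.jars.packages` and `spark.jars.repositories`, both quoted in this thread. The connectivity check is a plain TCP connection, so it is an assumption that this approximates what Ivy sees; it does not account for HTTP proxies.

```python
import socket
from urllib.parse import urlparse

# Resolver URL quoted in this thread.
SYNAPSEML_REPO = "https://mmlspark.azureedge.net/maven"


def spark_package_conf(package: str, repo: str = SYNAPSEML_REPO) -> dict:
    """Return the two Spark settings discussed in this thread as a plain dict."""
    return {
        "spark.jars.packages": package,
        "spark.jars.repositories": repo,
    }


def can_connect(url: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, DNS failures, and refused connections.
        return False


def build_session(package: str, app_name: str = "MyApp"):
    """Create a SparkSession with the package and resolver configured."""
    # Imported lazily so the helpers above work without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in spark_package_conf(package).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Running `can_connect` against `https://repo1.maven.org/maven2`, `https://repos.spark-packages.org`, and the resolver URL before calling `build_session` would show whether the timeouts in the log come from the network rather than from the package coordinate.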

serena-ruan avatar Feb 08 '23 00:02 serena-ruan

Hey, @serena-ruan let me check and get back to you. Thanks for your response.

shazriz avatar Feb 08 '23 18:02 shazriz

Hello, I have a similar issue. I tried to install SynapseML following the code on the official website, in a conda environment with Python 3.10 and PySpark 3.2:

import pyspark

# Use 0.11.2-spark3.3 version for Spark3.3 and 0.11.2 version for Spark3.2
spark = (
    pyspark.sql.SparkSession.builder.appName("MyApp")
    .config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.11.2")
    .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven")
    .getOrCreate()
)
import synapse.ml

It reports the error below. I have also tried different Maven repositories, local jar files, and excluding the packages. I know this is not an issue caused by SynapseML, but I would appreciate some help.

Thanks,

:::: WARNINGS
    [NOT FOUND  ] org.apache.httpcomponents#httpcore;4.4.13!httpcore.jar (3ms)
    ==== local-m2-cache: tried
      file:/home/rbelda/.m2/repository/org/apache/httpcomponents/httpcore/4.4.13/httpcore-4.4.13.jar
    [NOT FOUND  ] net.sourceforge.f2j#arpack_combined_all;0.1!arpack_combined_all.jar (2ms)
    ==== local-m2-cache: tried
      file:/home/rbelda/.m2/repository/net/sourceforge/f2j/arpack_combined_all/0.1/arpack_combined_all-0.1-javadoc.jar
    ::::::::::::::::::::::::::::::::::::::::::::::
    ::              FAILED DOWNLOADS            ::
    :: ^ see resolution messages for details  ^ ::
    ::::::::::::::::::::::::::::::::::::::::::::::
    :: org.apache.httpcomponents#httpcore;4.4.13!httpcore.jar
    :: net.sourceforge.f2j#arpack_combined_all;0.1!arpack_combined_all.jar
    ::::::::::::::::::::::::::::::::::::::::::::::
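
To see which artifacts Ivy gave up on, and where it looked for them in the local Maven cache, the log above can be parsed mechanically. The following is a minimal sketch, not a fix: `failed_downloads` and `m2_relative_path` are hypothetical helper names, the regular expression only assumes the `group#module;version!file` pattern visible in this log, and the path layout is the standard Maven repository layout that the `local-m2-cache` lines follow.

```python
import re
from pathlib import PurePosixPath

# Matches Ivy artifact identifiers such as
# "org.apache.httpcomponents#httpcore;4.4.13!httpcore.jar".
ARTIFACT_RE = re.compile(r"([\w.\-]+)#([\w.\-]+);([\w.\-]+)!([\w.\-]+)")


def failed_downloads(log_text: str) -> list:
    """Return unique (group, module, version, file) tuples listed after the FAILED DOWNLOADS banner."""
    # Everything before the banner is ignored; no banner means an empty result.
    _, _, tail = log_text.partition("FAILED DOWNLOADS")
    found = []
    for match in ARTIFACT_RE.finditer(tail):
        if match.groups() not in found:
            found.append(match.groups())
    return found


def m2_relative_path(group: str, module: str, version: str) -> PurePosixPath:
    """Path of the main jar relative to ~/.m2/repository (standard Maven layout)."""
    return PurePosixPath(*group.split("."), module, version, f"{module}-{version}.jar")
```

Applied to the log above, `failed_downloads` lists `httpcore` 4.4.13 and `arpack_combined_all` 0.1, and `m2_relative_path` gives the location under `~/.m2/repository` where the main jar for each would normally sit, which can then be checked by hand.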

rbeldagarcia avatar Sep 07 '23 10:09 rbeldagarcia