pyspark --packages com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1

Open pairwiserr opened this issue 4 years ago • 7 comments

Command

pyspark --packages com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1

returns an error.

unresolved dependencies
com.microsoft.ml.spark:mmlspark_2.11;1.0.0-rc1 not found

I'm not sure if this is relevant, but the error shows a semicolon instead of a colon before the version. This is with Python 3.7.3, Apache Maven 3.6.1, jdk1.8.0_221, spark-2.4.3-bin-hadoop2.7.

pairwiserr avatar Oct 16 '19 16:10 pairwiserr

👋 Thanks for opening your first issue here! If you're reporting a 🐞 bug, please make sure you include steps to reproduce it.

welcome[bot] avatar Oct 16 '19 16:10 welcome[bot]

When I try to download version 1.0.0-rc1 through maven coordinates, I get

Couldn't download artifact: Missing:
com.microsoft.ml.spark:mmlspark_2.11:jar:1.0.0-rc1

(it worked fine with version 0.18.1)

@mhamilton723 any hint? P.S. nice speech at the Spark+AI Europe summit :)

candalfigomoro avatar Oct 21 '19 10:10 candalfigomoro

Hey @pairwiserr and @candalfigomoro, sorry for this. It looks like sbt thought it was a snapshot because of the -rc. You can get around this for now by using our Maven repo, either by putting this in your build.sbt:

resolvers += "MMLSpark Repo" at "https://mmlspark.azureedge.net/maven"

or by adding that URL to spark.jars.repositories in your Spark settings:

https://mmlspark.azureedge.net/maven
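For PySpark users, the Spark-settings route might look roughly like the sketch below. It is not an official snippet: the helper names `mmlspark_conf` and `build_session` and the app name are made up for illustration, and it assumes the repository URL and coordinate quoted in this thread are still valid. `spark.jars.repositories` and `spark.jars.packages` are standard Spark configuration keys.

```python
# Values copied from this thread; treat them as assumptions and adjust as needed.
MMLSPARK_REPO = "https://mmlspark.azureedge.net/maven"
MMLSPARK_COORD = "com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1"


def mmlspark_conf(coordinate=MMLSPARK_COORD, repository=MMLSPARK_REPO):
    """Return Spark settings that add an extra Maven repository (hypothetical helper)."""
    return {
        # Extra repository searched when resolving spark.jars.packages
        "spark.jars.repositories": repository,
        # Maven coordinate to resolve, in group:artifact:version form
        "spark.jars.packages": coordinate,
    }


def build_session(app_name="mmlspark-rc-test"):
    """Create a SparkSession with the settings above (requires pyspark)."""
    # Imported lazily so mmlspark_conf() stays usable without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in mmlspark_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling `build_session()` in a fresh Python process should resolve the package from the extra repository. These settings are read when the JVM starts, so they will probably have no effect if a Spark session is already running.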

mhamilton723 avatar Oct 26 '19 03:10 mhamilton723

@mhamilton723 Thank you!

If you want to download the JAR, this is the command I used:

mvn dependency:get -DremoteRepositories="https://mmlspark.azureedge.net/maven" -Dartifact="com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1"

candalfigomoro avatar Oct 28 '19 10:10 candalfigomoro

@mhamilton723

The method I am using (and am familiar with) is the following:

pyspark --packages com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1

Is there no way to make this work with the latest version? This command is published on the main page (https://github.com/Azure/mmlspark#spark-package), and several of the methods shown there currently fail.

allard-jeff avatar Nov 14 '19 17:11 allard-jeff

@mhamilton723 if SBT and the versioning scheme for this library do not play well together, what are the plans to resolve it?

The proposed workaround of pointing to "https://mmlspark.azureedge.net/maven" sets a dangerous precedent: it is not easy to verify that the domain is indeed owned by Microsoft, and users will get used to fetching copies of the library from third-party domains.

gburboz avatar Jun 29 '20 20:06 gburboz

bin/spark-shell --conf spark.jars.repositories=https://mmlspark.azureedge.net/maven --packages com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc3

wangyum avatar Sep 01 '21 09:09 wangyum