SynapseML
pyspark --packages com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1
The command

pyspark --packages com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1

returns an error:
unresolved dependencies
com.microsoft.ml.spark:mmlspark_2.11;1.0.0-rc1 not found
I'm not sure if this is relevant, but there's a semicolon instead of a colon in that error. This is with Python 3.7.3, Apache Maven 3.6.1, JDK 1.8.0_221, and spark-2.4.3-bin-hadoop2.7.
When I try to download version 1.0.0-rc1 through maven coordinates, I get
Couldn't download artifact: Missing:
com.microsoft.ml.spark:mmlspark_2.11:jar:1.0.0-rc1
(it worked fine with version 0.18.1)
@mhamilton723 any hint? P.S. nice speech at the Spark+AI Europe summit :)
Hey @pairwiserr and @candalfigomoro, sorry for this. It looks like sbt treated the release as a snapshot because of the -rc suffix. You can get around this for now by using our Maven repo, either by putting this in your build.sbt:
resolvers += "MMLSpark Repo" at "https://mmlspark.azureedge.net/maven"
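For completeness, a minimal build.sbt sketch combining that resolver with the dependency itself; the `%%` coordinate and Scala version here are assumptions based on the `mmlspark_2.11` artifact name above, so adjust them to your project:

```scala
// Sketch of a build.sbt using the custom MMLSpark repo (assumed settings)
scalaVersion := "2.11.12" // mmlspark_2.11 implies a Scala 2.11 build

// Resolver pointing at the MMLSpark Maven repository, as suggested above
resolvers += "MMLSpark Repo" at "https://mmlspark.azureedge.net/maven"

// %% appends the Scala binary version, yielding mmlspark_2.11
libraryDependencies += "com.microsoft.ml.spark" %% "mmlspark" % "1.0.0-rc1"
```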
or by adding that URL to your Spark settings:

spark.jars.repositories https://mmlspark.azureedge.net/maven
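The same repository setting can also be passed on the command line when launching pyspark. A sketch combining it with the original --packages flag (assuming a standard spark-2.4.x install on the PATH):

```shell
# Tell Spark's dependency resolver to also search the MMLSpark repo,
# then pull the package as before
pyspark \
  --conf spark.jars.repositories=https://mmlspark.azureedge.net/maven \
  --packages com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1
```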
@mhamilton723 Thank you!
If you want to download the JAR, this is the command I used:
mvn dependency:get -DremoteRepositories="https://mmlspark.azureedge.net/maven" -Dartifact="com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1"
@mhamilton723
The method I am using (and am familiar with) is the following:
pyspark --packages com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc1
Is there a way to make this command work for the latest version? It is published on the main page (https://github.com/Azure/mmlspark#spark-package), and several of the methods shown there currently fail.
@mhamilton723 if SBT and this library's versioning scheme don't play well together, what are the plans to resolve that?
The proposed workaround of pointing to "https://mmlspark.azureedge.net/maven" sets a dangerous precedent: it is not easy to verify that the domain is indeed owned by Microsoft, and users will get used to fetching copies of the library from third-party domains.
bin/spark-shell --conf spark.jars.repositories=https://mmlspark.azureedge.net/maven --packages com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc3