sbt-spark-package

Package from spPublishLocal not usable due to the Scala version appearing in the Ivy module name

Open frensjan opened this issue 9 years ago • 5 comments

Up until now I've used sbt assembly, but now I'm trying to work with spPublishLocal for packaging before running (automated) integration tests (e.g. on Travis) for http://spark-packages.org/package/TargetHolding/pyspark-cassandra.

I've built pyspark-cassandra with:

sbt spPublishLocal
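
The relevant part of the build is essentially the standard sbt-spark-package setup, roughly as follows (a simplified sketch, assuming the plugin is added in project/plugins.sbt; the sparkVersion and sparkComponents lines are approximations rather than verbatim copies of the actual pyspark-cassandra build):

// build.sbt (sketch)
scalaVersion := "2.10.5"                        // source of the _2.10 suffix in the error below
version := "0.3.0"
spName := "TargetHolding/pyspark-cassandra"     // spark-packages coordinate used with --packages
sparkVersion := "1.5.2"                         // approximate: matches the Spark distribution used below
sparkComponents ++= Seq("streaming", "sql")     // approximate: Spark modules the project depends on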

When I run pyspark with

PYSPARK_DRIVER_PYTHON=ipython \
path/to/spark-1.5.2-bin-hadoop2.6/bin/pyspark \
--conf spark.cassandra.connection.host=localhost \
--driver-memory 2g \
--master local[*] \
--packages TargetHolding/pyspark-cassandra:0.3.0

I get:

Python 2.7.10 (default, Sep 24 2015, 17:50:09) 
Type "copyright", "credits" or "license" for more information.

IPython 2.4.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
Ivy Default Cache set to: /home/frens-jan/.ivy2/cache
The jars for the packages stored in: /home/frens-jan/.ivy2/jars
:: loading settings :: url = jar:file:/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
TargetHolding#pyspark-cassandra added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
    confs: [default]
:: resolution report :: resolve 794ms :: artifacts dl 0ms
    :: modules in use:
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
    ---------------------------------------------------------------------

:: problems summary ::
:::: WARNINGS
        ::::::::::::::::::::::::::::::::::::::::::::::

        ::          UNRESOLVED DEPENDENCIES         ::

        ::::::::::::::::::::::::::::::::::::::::::::::

        :: TargetHolding#pyspark-cassandra;0.3.0: java.text.ParseException: inconsistent module descriptor file found in '/home/frens-jan/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml': bad module name: expected='pyspark-cassandra' found='pyspark-cassandra_2.10'; 

        ::::::::::::::::::::::::::::::::::::::::::::::


:::: ERRORS
        local-ivy-cache: bad module name found in /home/frens-jan/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml: expected='pyspark-cassandra found='pyspark-cassandra_2.10'


:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: TargetHolding#pyspark-cassandra;0.3.0: java.text.ParseException: inconsistent module descriptor file found in '/home/frens-jan/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml': bad module name: expected='pyspark-cassandra' found='pyspark-cassandra_2.10'; ]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1011)
    at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:286)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/shell.py in <module>()
     41     SparkContext.setSystemProperty("spark.executor.uri", os.environ["SPARK_EXECUTOR_URI"])
     42 
---> 43 sc = SparkContext(pyFiles=add_files)
     44 atexit.register(lambda: sc.stop())
     45 

/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/context.pyc in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    108         """
    109         self._callsite = first_spark_call() or CallSite(None, None, None)
--> 110         SparkContext._ensure_initialized(self, gateway=gateway)
    111         try:
    112             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/context.pyc in _ensure_initialized(cls, instance, gateway)
    232         with SparkContext._lock:
    233             if not SparkContext._gateway:
--> 234                 SparkContext._gateway = gateway or launch_gateway()
    235                 SparkContext._jvm = SparkContext._gateway.jvm
    236 

/home/frens-jan/Workspaces/tgho/spark/pyspark-cassandra/lib/spark-1.5.2-bin-hadoop2.6/python/pyspark/java_gateway.pyc in launch_gateway()
     92                 callback_socket.close()
     93         if gateway_port is None:
---> 94             raise Exception("Java gateway process exited before sending the driver its port number")
     95 
     96         # In Windows, ensure the Java child processes do not linger after Python has exited.

Exception: Java gateway process exited before sending the driver its port number

In [1]: 

Any ideas what I am doing wrong?

frensjan avatar Jan 15 '16 17:01 frensjan

anyone?

frensjan avatar Jan 31 '16 00:01 frensjan

This is a bug in the interaction with Spark. I've looked into it for a while, but couldn't resolve it on the Spark side. I'll look deeper here.

brkyvz avatar Mar 10 '16 00:03 brkyvz
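
Reading the Ivy output above, the descriptor ends up at ~/.ivy2/local/TargetHolding/pyspark-cassandra/0.3.0/ivys/ivy.xml, i.e. in a path without a Scala suffix, while the ivy.xml inside still declares the module as pyspark-cassandra_2.10; Ivy then rejects the descriptor as inconsistent when --packages asks for the plain name. One untested workaround sketch at the plain-sbt level is to stop sbt from appending the Scala version to the module name at all (crossPaths and moduleName are standard sbt settings; whether this plays well with the rest of sbt-spark-package is unverified):

// build.sbt (untested sketch)
crossPaths := false                   // do not append _2.10 to artifact names and publish paths
moduleName := "pyspark-cassandra"     // module name exactly as the --packages coordinate expects it

The workaround further down, from the spark-deep-learning build, takes the opposite route and moves the Scala version into the version string instead.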

@frensjan Did you ever resolve this?

metasim avatar Nov 02 '17 19:11 metasim

nope, sorry

frensjan avatar Nov 17 '17 15:11 frensjan

Ran into a similar issue; added the following for https://github.com/databricks/spark-deep-learning:

organization := "databricks"

name := "spark-deep-learning"

// spark-packages coordinate, i.e. "databricks/spark-deep-learning"
spName := organization.value + "/" + name.value

// publish under the plain module name and append the Scala version to the version string instead
projectID := {ModuleID(organization.value, name.value, s"${version.value}-s_$scalaMajorVersion")}
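
Note that scalaMajorVersion is not a built-in sbt key, so the snippet assumes it is defined elsewhere in the build; a plain val is enough for the string interpolation used above (the value here is an assumption, use whatever Scala binary version the build actually targets):

lazy val scalaMajorVersion = "2.11"   // Scala binary version; hard-coded here as an assumption

With this override spPublishLocal should publish the module as databricks#spark-deep-learning with a version like <version>-s_2.11, so the Scala suffix lives in the version string rather than the module name, and a --packages lookup by the plain name no longer hits the 'inconsistent module descriptor' error above.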

touchdown avatar Apr 19 '18 02:04 touchdown