
java.lang.RuntimeException in Bootstrap your First Project


Describe the bug: After auto-generating a project, training on the data always results in the same Java RuntimeException.

To Reproduce: Command as given in the README.txt:

./gradlew -q sparkSubmit -Dmain=com.salesforce.app.Titanic -Dargs="--run-type=train --model-location /home/TransmogrifAI/./titanic/build/spark/model 
--read-location Passenger=/home/TransmogrifAI/test-data/PassengerDataAll.csv"

Logs or screenshots

Using properties file: null
Parsed arguments:
  master                  local[*]
  deployMode              client
  executorMemory          2G
  executorCores           null
  totalExecutorCores      null
  propertiesFile          null
  driverMemory            4G
  driverCores             1
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               com.salesforce.app.Titanic
  primaryResource         file:/home/Desktop/TransmogrifAI/titanic/build/install/titanic/lib/titanic-0.0.1.jar
  name                    titanic:com.salesforce.app.Titanic
  childArgs               [--run-type=train --model-location /home/Desktop/TransmogrifAI/./titanic/build/spark/model --read-location Passenger=/home/Desktop/TransmogrifAI/test-data/PassengerDataAll.csv]
...
...

19/09/13 00:08:10 INFO Titanic$: Parsed config:
{
  "runType" : "Train",
  "defaultParams" : {
    "stageParams" : { },
    "readerParams" : { },
    "customParams" : { },
    "alternateReaderParams" : { }
  },
  "readLocations" : {
    "Passenger" : "/home/Desktop/TransmogrifAI/test-data/PassengerDataAll.csv"
  },
  "modelLocation" : "/home/Desktop/TransmogrifAI/./titanic/build/spark/model"
}

Exception in thread "main" java.lang.RuntimeException: Failed to write out stage 'FeatureGeneratorStage_000000000005'
        at com.salesforce.op.stages.OpPipelineStageWriter.writeToJson(OpPipelineStageWriter.scala:81)
        at com.salesforce.op.OpWorkflowModelWriter$$anonfun$3.apply(OpWorkflowModelWriter.scala:131)
        at com.salesforce.op.OpWorkflowModelWriter$$anonfun$3.apply(OpWorkflowModelWriter.scala:131)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
        at com.salesforce.op.OpWorkflowModelWriter.stagesJArray(OpWorkflowModelWriter.scala:131)
        at com.salesforce.op.OpWorkflowModelWriter.stagesJArray(OpWorkflowModelWriter.scala:108)
        at com.salesforce.op.OpWorkflowModelWriter.toJson(OpWorkflowModelWriter.scala:83)
        at com.salesforce.op.OpWorkflowModelWriter.toJsonString(OpWorkflowModelWriter.scala:68)
        at com.salesforce.op.OpWorkflowModelWriter.saveImpl(OpWorkflowModelWriter.scala:58)
        at org.apache.spark.ml.util.MLWriter.save(ReadWrite.scala:103)
        at com.salesforce.op.OpWorkflowModelWriter$.save(OpWorkflowModelWriter.scala:193)
        at com.salesforce.op.OpWorkflowModel.save(OpWorkflowModel.scala:221)
        at com.salesforce.op.OpWorkflowRunner.train(OpWorkflowRunner.scala:165)
        at com.salesforce.op.OpWorkflowRunner.run(OpWorkflowRunner.scala:308)
        at com.salesforce.op.OpAppWithRunner.run(OpApp.scala:211)
        at com.salesforce.op.OpApp.main(OpApp.scala:182)
        at com.salesforce.app.Titanic.main(Titanic.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Argument 'extractFn' [com.salesforce.app.Features$$anonfun$5] cannot be serialized. Make sure com.salesforce.app.Features$$anonfun$5 has either no-args ctor or is an object, and does not have any external dependencies, e.g. use any out of scope variables.
        at com.salesforce.op.stages.OpPipelineStageSerializationFuns$class.serializeArgument(OpPipelineStageReaderWriter.scala:234)
        at com.salesforce.op.stages.DefaultValueReaderWriter.serializeArgument(DefaultValueReaderWriter.scala:48)
        at com.salesforce.op.stages.DefaultValueReaderWriter$$anonfun$write$1.apply(DefaultValueReaderWriter.scala:70)
        at com.salesforce.op.stages.DefaultValueReaderWriter$$anonfun$write$1.apply(DefaultValueReaderWriter.scala:69)
        at scala.util.Try$.apply(Try.scala:192)
        at com.salesforce.op.stages.DefaultValueReaderWriter.write(DefaultValueReaderWriter.scala:69)
        at com.salesforce.op.stages.FeatureGeneratorStageReaderWriter.write(FeatureGeneratorStage.scala:189)
        at com.salesforce.op.stages.FeatureGeneratorStageReaderWriter.write(FeatureGeneratorStage.scala:129)
        at com.salesforce.op.stages.OpPipelineStageWriter.writeToJson(OpPipelineStageWriter.scala:80)
        ... 31 more
Caused by: java.lang.RuntimeException: Failed to create an instance of class 'com.salesforce.app.Features$$anonfun$5'. Class has to either have a no-args ctor or be an object.
        at com.salesforce.op.utils.reflection.ReflectionUtils$.newInstance(ReflectionUtils.scala:106)
        at com.salesforce.op.utils.reflection.ReflectionUtils$.newInstance(ReflectionUtils.scala:87)
        at com.salesforce.op.stages.OpPipelineStageSerializationFuns$class.serializeArgument(OpPipelineStageReaderWriter.scala:231)
        ... 39 more
Caused by: java.lang.NoSuchFieldException: MODULE$
        at java.lang.Class.getField(Class.java:1703)
        at com.salesforce.op.utils.reflection.ReflectionUtils$.newInstance(ReflectionUtils.scala:102)
        ... 41 more

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':sparkSubmit'.
> Process 'command '/home/spark-2.3.3-bin-hadoop2.7/bin/spark-submit'' finished with non-zero exit value 1

Additional context: I am on Ubuntu 18.04 with spark-2.3.3-bin-hadoop2.7, OpenJDK 1.8.0_222, and TransmogrifAI 0.6.1.

Hoping for your help, thanks.

shoafj7917 avatar Sep 13 '19 04:09 shoafj7917

Thank you for reporting this. This is definitely a bug related to the recent serialization changes we made for our models. We will try to fix it asap.

In the meantime you can try using the 0.5.x version.
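For context on the failure above: when saving the model, the stage writer records the 'extractFn' by class name and later needs to re-create it by reflection, which (per the error message) requires the function to be either a Scala object or a class with a no-args constructor. The extract functions emitted by the bootstrap template compile to anonymous function classes such as com.salesforce.app.Features$$anonfun$5, which do not meet that requirement. A minimal sketch of the constraint (illustrative only, not the actual ReflectionUtils code):

```scala
// Illustrative sketch of the reflection constraint; names and logic are
// assumptions, not com.salesforce.op.utils.reflection.ReflectionUtils itself.
object ExtractFnSketch {

  // Re-create an extract function from its class name: works for Scala objects
  // (singleton exposed via the static MODULE$ field) and for classes with a
  // public no-args constructor; anything else fails, as in the trace above.
  def reinstantiate(className: String): Any = {
    val klazz = Class.forName(className)
    try klazz.getField("MODULE$").get(null)
    catch {
      case _: NoSuchFieldException =>
        klazz.getDeclaredConstructor().newInstance()
    }
  }

  // Re-instantiable by that rule: a concrete class with a no-args constructor.
  class AgeExtract extends (Map[String, String] => Option[Double]) {
    def apply(row: Map[String, String]): Option[Double] =
      row.get("age").map(_.toDouble)
  }

  // Not re-instantiable by that rule: an anonymous function compiles to a
  // synthetic class (like Features$$anonfun$5), which is exactly the failure
  // shown in the stack trace.
  val ageExtractAnon: Map[String, String] => Option[Double] =
    row => row.get("age").map(_.toDouble)
}
```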

tovbinm avatar Sep 13 '19 05:09 tovbinm

Is there a timeline for when this will be fixed?

shoafj7917 avatar Oct 09 '19 22:10 shoafj7917

This requires changing our code generator templates for feature engineering in the CLI so that it generates concrete feature extractor classes. Perhaps @vpatryshev or @wsuchy can have a look?

tovbinm avatar Oct 10 '19 18:10 tovbinm

Sure can do; can I have more details?


vpatryshev avatar Oct 10 '19 18:10 vpatryshev

@vpatryshev it's similar to what @gerashegalov did in this PR: https://github.com/salesforce/TransmogrifAI/pull/406

For example, this extractor code with an anonymous function:

val rowId = FeatureBuilder.Integral[BostonHouse].extract(_.rowId.toIntegral).asPredictor

has to be replaced with a concrete class:

val rowId = FeatureBuilder.Integral[BostonHouse].extract(new BostonFeatures.RowId).asPredictor

object BostonFeatures {
    class IntegralExtract(f: BostonHouse => Int) extends BostonFeatureFunc[Integral] {
        override def apply(v1: BostonHouse): Integral = f(v1).toIntegral
    }
    class RowId extends IntegralExtract(_.rowId)
}

tovbinm avatar Oct 10 '19 19:10 tovbinm

@vpatryshev any progress on this one? thanks!

tovbinm avatar Nov 14 '19 19:11 tovbinm

Oh, I have not even touched it yet. I keep it in mind all the time, though.


vpatryshev avatar Nov 15 '19 02:11 vpatryshev

FYI. Working on it now.

vpatryshev avatar Jan 18 '20 00:01 vpatryshev

@shoafj7917, can you please check whether you can reproduce this with the latest version of TransmogrifAI? What you wrote is not runnable on the current version, and I don't see how to reproduce this behavior. Which directory do you run it in?

vpatryshev avatar Jan 18 '20 04:01 vpatryshev

When I tried with version 0.6.1, it failed in the same manner as @shoafj7917 described.

@vpatryshev, does the latest version of TransmogrifAI mean the master branch?

The master branch fails with the following error.

In short:

 Could not find com.salesforce.transmogrifai:transmogrifai-core_2.11:0.6.2-SNAPSHOT.

In full:

15:47:36.118 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] * What went wrong:
15:47:36.118 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] Execution failed for task ':compileJava'.
15:47:36.118 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] > Could not resolve all files for configuration ':compileClasspath'.
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]    > Could not find com.salesforce.transmogrifai:transmogrifai-core_2.11:0.6.2-SNAPSHOT.
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]      Searched in the following locations:
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]        - https://jcenter.bintray.com/com/salesforce/transmogrifai/transmogrifai-core_2.11/0.6.2-SNAPSHOT/maven-metadata.xml
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]        - https://jcenter.bintray.com/com/salesforce/transmogrifai/transmogrifai-core_2.11/0.6.2-SNAPSHOT/transmogrifai-core_2.11-0.6.2-SNAPSHOT.pom
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]        - https://jcenter.bintray.com/com/salesforce/transmogrifai/transmogrifai-core_2.11/0.6.2-SNAPSHOT/transmogrifai-core_2.11-0.6.2-SNAPSHOT.jar
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]        - https://repo.maven.apache.org/maven2/com/salesforce/transmogrifai/transmogrifai-core_2.11/0.6.2-SNAPSHOT/maven-metadata.xml
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]        - https://repo.maven.apache.org/maven2/com/salesforce/transmogrifai/transmogrifai-core_2.11/0.6.2-SNAPSHOT/transmogrifai-core_2.11-0.6.2-SNAPSHOT.pom
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]        - https://repo.maven.apache.org/maven2/com/salesforce/transmogrifai/transmogrifai-core_2.11/0.6.2-SNAPSHOT/transmogrifai-core_2.11-0.6.2-SNAPSHOT.jar
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]      Required by:
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]          project :
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] * Try:
15:47:36.119 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] Run with --stacktrace option to get the stack trace.  Run with --scan to get full insights.
15:47:36.120 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]

jeesim2 avatar Jan 25 '20 07:01 jeesim2

The master version has not been published yet, but you can publish it locally:

$ git clone git@github.com:salesforce/TransmogrifAI.git
$ cd TransmogrifAI
$ ./gradlew publishToMavenLocal

tovbinm avatar Jan 25 '20 16:01 tovbinm

Even after installing the master branch into the local Maven repository and adding mavenLocal() to the generated bootstrap project's build.gradle file, bootstrap project training still fails as follows:

nojihun-ui-MacBook-Pro:titanic jihun$ ./gradlew sparkSubmit -Dmain=com.salesforce.app.Titanic -Dargs="--run-type=train --model-location=/tmp/titanic-model --read-location Passenger=`pwd`/../test-data/PassengerDataAll.csv"

> Task :sparkSubmit
Using properties file: null
Parsed arguments:
  master                  local[*]
  deployMode              client
  executorMemory          2G
  executorCores           null
  totalExecutorCores      null
  propertiesFile          null
  driverMemory            4G
  driverCores             1
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               com.salesforce.app.Titanic
  primaryResource         file:/work_base/git_repo/TransmogrifAI/titanic/build/install/titanic/lib/titanic-0.0.1.jar
  name                    titanic:com.salesforce.app.Titanic
  childArgs               [--run-type=train --model-location=/tmp/titanic-model --read-location Passenger=/work_base/git_repo/TransmogrifAI/titanic/../test-data/PassengerDataAll.csv]
  jars                    ...........
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true

Spark properties used, including those specified through
 --conf and those from the properties file null:
  (spark.driver.memory,4G)
  (spark.serializer,org.apache.spark.serializer.KryoSerializer)


Main class:
com.salesforce.app.Titanic
Arguments:
--run-type=train
--model-location=/tmp/titanic-model
--read-location
Passenger=/work_base/git_repo/TransmogrifAI/titanic/../test-data/PassengerDataAll.csv
Spark config:
(spark.serializer,org.apache.spark.serializer.KryoSerializer)
(spark.jars.............
20/01/30 06:30:03 INFO Titanic$: Parsed config:
{
  "runType" : "Train",
  "defaultParams" : {
    "stageParams" : { },
    "readerParams" : { },
    "customParams" : { },
    "alternateReaderParams" : { }
  },
  "readLocations" : {
    "Passenger" : "/work_base/git_repo/TransmogrifAI/titanic/../test-data/PassengerDataAll.csv"
  },
  "modelLocation" : "/tmp/titanic-model"
}
Exception in thread "main" java.lang.RuntimeException: Failed to write out stage 'FeatureGeneratorStage_000000000005'
        at com.salesforce.op.stages.OpPipelineStageWriter.writeToJson(OpPipelineStageWriter.scala:81)
        at com.salesforce.op.OpWorkflowModelWriter$$anonfun$3.apply(OpWorkflowModelWriter.scala:131)
        at com.salesforce.op.OpWorkflowModelWriter$$anonfun$3.apply(OpWorkflowModelWriter.scala:131)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
        at com.salesforce.op.OpWorkflowModelWriter.stagesJArray(OpWorkflowModelWriter.scala:131)
        at com.salesforce.op.OpWorkflowModelWriter.stagesJArray(OpWorkflowModelWriter.scala:108)
        at com.salesforce.op.OpWorkflowModelWriter.toJson(OpWorkflowModelWriter.scala:83)
        at com.salesforce.op.OpWorkflowModelWriter.toJsonString(OpWorkflowModelWriter.scala:68)
        at com.salesforce.op.OpWorkflowModelWriter.saveImpl(OpWorkflowModelWriter.scala:58)
        at org.apache.spark.ml.util.MLWriter.save(ReadWrite.scala:103)
        at com.salesforce.op.OpWorkflowModelWriter$.save(OpWorkflowModelWriter.scala:193)
        at com.salesforce.op.OpWorkflowModel.save(OpWorkflowModel.scala:221)
        at com.salesforce.op.OpWorkflowRunner.train(OpWorkflowRunner.scala:165)
        at com.salesforce.op.OpWorkflowRunner.run(OpWorkflowRunner.scala:308)
        at com.salesforce.op.OpAppWithRunner.run(OpApp.scala:211)
        at com.salesforce.op.OpApp.main(OpApp.scala:182)
        at com.salesforce.app.Titanic.main(Titanic.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Argument 'extractFn' [com.salesforce.app.Features$$anonfun$5] cannot be serialized. Make sure com.salesforce.app.Features$$anonfun$5 has either no-args ctor or is an object, and does not have any external dependencies, e.g. use any out of scope variables.
        at com.salesforce.op.stages.OpPipelineStageSerializationFuns$class.serializeArgument(OpPipelineStageReaderWriter.scala:236)
        at com.salesforce.op.stages.DefaultValueReaderWriter.serializeArgument(DefaultValueReaderWriter.scala:48)
        at com.salesforce.op.stages.DefaultValueReaderWriter$$anonfun$write$1.apply(DefaultValueReaderWriter.scala:70)
        at com.salesforce.op.stages.DefaultValueReaderWriter$$anonfun$write$1.apply(DefaultValueReaderWriter.scala:69)
        at scala.util.Try$.apply(Try.scala:192)
        at com.salesforce.op.stages.DefaultValueReaderWriter.write(DefaultValueReaderWriter.scala:69)
        at com.salesforce.op.stages.FeatureGeneratorStageReaderWriter.write(FeatureGeneratorStage.scala:189)
        at com.salesforce.op.stages.FeatureGeneratorStageReaderWriter.write(FeatureGeneratorStage.scala:129)
        at com.salesforce.op.stages.OpPipelineStageWriter.writeToJson(OpPipelineStageWriter.scala:80)
        ... 31 more
Caused by: java.lang.RuntimeException: Failed to create an instance of class 'com.salesforce.app.Features$$anonfun$5'. Class has to either have a no-args ctor or be an object.
        at com.salesforce.op.utils.reflection.ReflectionUtils$.newInstance(ReflectionUtils.scala:106)
        at com.salesforce.op.utils.reflection.ReflectionUtils$.newInstance(ReflectionUtils.scala:87)
        at com.salesforce.op.stages.OpPipelineStageSerializationFuns$class.serializeArgument(OpPipelineStageReaderWriter.scala:233)
        ... 39 more
Caused by: java.lang.NoSuchFieldException: MODULE$
        at java.lang.Class.getField(Class.java:1703)
        at com.salesforce.op.utils.reflection.ReflectionUtils$.newInstance(ReflectionUtils.scala:102)
        ... 41 more

> Task :sparkSubmit FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':sparkSubmit'.
> Process 'command '/Users/jihun/apps/spark/bin/spark-submit'' finished with non-zero exit value 1

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 8m 0s
7 actionable tasks: 2 executed, 5 up-to-date
nojihun-ui-MacBook-Pro:titanic jihun$

jeesim2 avatar Jan 29 '20 21:01 jeesim2

In order to fix this we would need to modify the template used to generate the project. Until it is fixed, I would recommend starting from the existing examples and creating your project manually in a similar fashion.
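Concretely, writing the feature extractors as concrete, no-args classes (as in the BostonFeatures snippet above) instead of anonymous functions avoids the serialization failure. A hypothetical sketch of what that could look like for a manually written Titanic project (the Passenger field names and the .toReal conversion are assumptions, not taken from the generated template):

```scala
// Hypothetical sketch for a manually written Titanic project; field names
// and conversions are assumptions, not the generated Features.scala.
object PassengerFeatures {
  // A concrete extractor class with a no-args constructor can be re-created
  // by reflection when the model is saved, unlike an anonymous function.
  class RealExtract(f: Passenger => Real) extends Function1[Passenger, Real]
    with Serializable {
    override def apply(p: Passenger): Real = f(p)
  }
  class AgeExtract extends RealExtract(_.getAge.toReal)
}

// In Features.scala:
// before (anonymous function, fails to serialize the workflow model):
//   val age = FeatureBuilder.Real[Passenger].extract(_.getAge.toReal).asPredictor
// after (concrete class with a no-args constructor):
val age = FeatureBuilder.Real[Passenger]
  .extract(new PassengerFeatures.AgeExtract)
  .asPredictor
```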

tovbinm avatar Sep 09 '21 05:09 tovbinm