beakerx
Declare sparkSession and sparkContext as transient variables
The Spark magic `%%spark --start` does not declare the Spark context as a transient variable. Because of this, any RDD manipulation throws the error below:
```
Caused by: java.io.NotSerializableException: org.apache.spark.SparkContext
Serialization stack:
- object not serializable (class: org.apache.spark.SparkContext, value: org.apache.spark.SparkContext@71836471)
- field (class: $iw, name: sc, type: class org.apache.spark.SparkContext)
```
The fix is to declare the variables created in https://github.com/twosigma/beakerx/blob/d6233eb3baa8a8528e415401985da9e2e2d4ccfe/kernel/sparkex/src/main/java/com/twosigma/beakerx/widget/SparkEngineBase.java#L160 as transient:
```java
"@transient val %s = SparkVariable.getSparkSession()\n" +
"@transient val %s = %s.sparkContext\n" +
```
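The reason this fix works is that Java serialization skips fields marked `transient`, so the non-serializable context is no longer dragged into serialized closures. A minimal, Spark-free sketch of the mechanism — `FakeSparkContext` is a hypothetical stand-in for `org.apache.spark.SparkContext`, not anything from the Spark or BeakerX codebases:

```java
import java.io.*;

public class TransientDemo {
    // Stand-in for SparkContext: a class that is not Serializable.
    static class FakeSparkContext {}

    // Without transient, serializing the wrapper fails with
    // java.io.NotSerializableException, matching the stack trace above.
    static class BadWrapper implements Serializable {
        FakeSparkContext sc = new FakeSparkContext();
    }

    // Marking the field transient excludes it from serialization,
    // which is what the proposed @transient declarations achieve.
    static class GoodWrapper implements Serializable {
        transient FakeSparkContext sc = new FakeSparkContext();
    }

    static boolean trySerialize(Object obj) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trySerialize(new BadWrapper()));   // false
        System.out.println(trySerialize(new GoodWrapper()));  // true
    }
}
```

The trade-off is that a transient field is `null` after deserialization, which is fine here because the context should never travel to executors in the first place.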
Please let me know if there is any other way to work around this in the latest version.
Hi @kiranchitturi, could you provide a simple example (ipynb) that would allow us to reproduce the problem?