spark-lucenerdd
Serialization Issue with org.apache.lucene.facet.FacetsConfig
I am facing the following serialization issue:
Job aborted due to stage failure: Task 144.0 in stage 25.0 (TID 2122) had a not serializable result: org.apache.lucene.facet.FacetsConfig
Serialization stack:
- object not serializable (class: org.apache.lucene.facet.FacetsConfig, value: org.apache.lucene.facet.FacetsConfig@53a75ca4)
- field (class: org.zouzias.spark.lucenerdd.partition.LuceneRDDPartition, name: FacetsConfig, type: class org.apache.lucene.facet.FacetsConfig)
- object (class org.zouzias.spark.lucenerdd.partition.LuceneRDDPartition, org.zouzias.spark.lucenerdd.partition.LuceneRDDPartition@30e83579)
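For context, a "not serializable result" like the one above is what Java serialization (Spark's default for task results) produces when a Serializable object holds a field whose type does not implement Serializable. A minimal, hypothetical sketch of the mechanism, with illustrative stand-in classes (these are not the library's actual classes):

```java
import java.io.*;

// Hypothetical stand-in for org.apache.lucene.facet.FacetsConfig: not Serializable.
class FacetsConfigLike { }

// Stand-in for LuceneRDDPartition: Serializable itself, but holding a
// non-serializable field, which makes serializing it fail as in the stack trace.
class PartitionLike implements Serializable {
    FacetsConfigLike facetsConfig = new FacetsConfigLike();
}

public class Demo {
    // Returns true if 'o' survives Java serialization, false on NotSerializableException.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false; // the exception names the offending class, like Spark's message
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new PartitionLike())); // prints false
    }
}
```

This would explain why the failure only appears intermittently: the partition object only has to be serialized on code paths where it leaves the executor that created it.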
It is hard to replicate and I am not sure what triggers it, as indexing sometimes works just fine.
Do you have any idea? @zouzias
Examples of jobs: some failed, some succeeded (same code):
Hi,
do you use faceted search at all? I would like to remove the faceted search feature since DataFrames with parquet files as a backend are superior.
See: https://github.com/zouzias/spark-lucenerdd/pull/171
I am not using that feature, but indexing still seems to reference FacetsConfig.
This looks very suspicious. Can you share some code so I can reproduce the error?
- object (class org.zouzias.spark.lucenerdd.partition.LuceneRDDPartition, org.zouzias.spark.lucenerdd.partition.LuceneRDDPartition@30e83579)
It seems that the LuceneRDDPartition object is being serialized, which should never happen. Are you using the cartesianlinker method?
Yes, I am using the cartesianlinker.
I am not sure what triggers it, but I will update the issue if I can find a repeatable sample.
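If the partition object really does need to cross the wire on some code path, one common JVM-side workaround is to mark the non-serializable field transient and re-create it lazily after deserialization (the Scala equivalent would be a `@transient lazy val`). A hedged sketch of that idea, again with illustrative stand-in classes rather than the library's real ones:

```java
import java.io.*;

// Hypothetical stand-in for org.apache.lucene.facet.FacetsConfig: not Serializable.
class FacetsConfigLike { }

class PartitionLike implements Serializable {
    // 'transient' tells Java serialization to skip this field, so the enclosing
    // object can be shipped; the field comes back null and must be re-created.
    transient FacetsConfigLike facetsConfig = new FacetsConfigLike();

    FacetsConfigLike config() {
        if (facetsConfig == null) {
            facetsConfig = new FacetsConfigLike(); // lazy re-init after deserialization
        }
        return facetsConfig;
    }
}

public class TransientFix {
    // Serialize and deserialize an object, roughly as Spark would when
    // shipping a task result back to the driver.
    static PartitionLike roundTrip(PartitionLike p) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(buf);
        out.writeObject(p);
        out.close();
        ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()));
        return (PartitionLike) in.readObject();
    }

    public static void main(String[] args) throws Exception {
        PartitionLike copy = roundTrip(new PartitionLike());
        System.out.println(copy.facetsConfig == null); // true: transient field dropped
        System.out.println(copy.config() != null);     // true: lazily re-created
    }
}
```

Whether this is the right fix here depends on whether LuceneRDDPartition is meant to be serialized at all; per the discussion above, the cleaner outcome may simply be keeping it from leaving the executor (or removing the faceting code path, as proposed in PR #171).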