spark-dependencies
Support for Elasticsearch 8.x
Describe the bug
Spark dependencies currently works only with Elasticsearch 7.x.
To Reproduce
Steps to reproduce the behavior:
- Install Elasticsearch 8.x
- Launch spark dependencies
The logs then show:
Exception in thread "main" org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:348)
at org.elasticsearch.hadoop.rest.RestService.findPartitions(RestService.java:220)
at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions$lzycompute(AbstractEsRDD.scala:79)
at org.elasticsearch.spark.rdd.AbstractEsRDD.esPartitions(AbstractEsRDD.scala:78)
at org.elasticsearch.spark.rdd.AbstractEsRDD.getPartitions(AbstractEsRDD.scala:48)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:273)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:273)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4(Partitioner.scala:78)
at org.apache.spark.Partitioner$.$anonfun$defaultPartitioner$4$adapted(Partitioner.scala:78)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike.map(TraversableLike.scala:238)
at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
at scala.collection.immutable.List.map(List.scala:298)
at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:78)
at org.apache.spark.rdd.RDD.$anonfun$groupBy$1(RDD.scala:714)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
at org.apache.spark.rdd.RDD.groupBy(RDD.scala:714)
at org.apache.spark.api.java.JavaRDDLike.groupBy(JavaRDDLike.scala:243)
at org.apache.spark.api.java.JavaRDDLike.groupBy$(JavaRDDLike.scala:239)
at org.apache.spark.api.java.AbstractJavaRDDLike.groupBy(JavaRDDLike.scala:45)
at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:273)
at io.jaegertracing.spark.dependencies.elastic.ElasticsearchDependenciesJob.run(ElasticsearchDependenciesJob.java:249)
at io.jaegertracing.spark.dependencies.DependenciesSparkJob.run(DependenciesSparkJob.java:54)
at io.jaegertracing.spark.dependencies.DependenciesSparkJob.main(DependenciesSparkJob.java:40)
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Unsupported/Unknown Elasticsearch version [8.2.3].Highest supported version is [7.x]. You may need to upgrade ES-Hadoop.
at org.elasticsearch.hadoop.util.EsMajorVersion.parse(EsMajorVersion.java:91)
at org.elasticsearch.hadoop.rest.RestClient.mainInfo(RestClient.java:756)
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:338)
... 31 more
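The root cause is the nested exception: the connector reads the cluster version, finds major version 8, and rejects it because the bundled ES-Hadoop build only knows majors up to 7. The outer "Cannot detect ES version" message is just the wrapper. A minimal sketch of that kind of version gate, assuming a simple "major must not exceed the highest known major" rule (the class and method names here are hypothetical and this is not the ES-Hadoop source):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of a major-version gate; not the ES-Hadoop implementation. */
public class VersionGate {
    // Assumption: the connector build in use understands Elasticsearch majors up to 7.
    public static final int HIGHEST_SUPPORTED_MAJOR = 7;

    // Captures the leading digits before the first dot, e.g. "8" in "8.2.3".
    private static final Pattern MAJOR = Pattern.compile("^(\\d+)\\.");

    /** Extracts the major version number from a version string such as "8.2.3". */
    public static int majorOf(String version) {
        Matcher m = MAJOR.matcher(version);
        if (!m.find()) {
            throw new IllegalArgumentException("Unparseable version [" + version + "]");
        }
        return Integer.parseInt(m.group(1));
    }

    /** Returns true when the cluster major is not newer than what the connector knows. */
    public static boolean isSupported(String version) {
        return majorOf(version) <= HIGHEST_SUPPORTED_MAJOR;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"7.17.0", "8.2.3"}) {
            System.out.println(v + " supported: " + isSupported(v));
        }
    }
}
```

Under this assumption, a 7.x cluster passes and the 8.2.3 cluster from the report is rejected, which is why upgrading the connector dependency is the likely fix rather than changing network settings such as 'es.nodes.wan.only'.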
Expected behavior
Spark dependencies should work with ES 8.x, since Jaeger itself is "supporting" it.
Version (please complete the following information):
- OS: Linux
- Jaeger version: 1.31
- Deployment: Kubernetes
Additional context
This could possibly be fixed by using the following dependency in pom.xml:
<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-spark-20_${version.scala.binary}</artifactId>
  <version>8.2.3</version>
</dependency>
For those of you who want to use this with Elasticsearch 8, here is the repo with the artefact built from @sylvainOL's PR: https://github.com/vmaleze/spark-dependencies-es8/pkgs/container/spark-dependencies-es8
Unfortunately, the tests fail at the latest-jaeger stage, and I have issues testing this on arm64, so I cannot fix it and submit a PR to merge this into the official repo. However, the forked repo works like a charm with Elasticsearch 8.