project-website
Adds new blog post announcing opensearch hadoop
Description
Adds a new blog post announcing the availability of the hadoop client
Issues Resolved
[List any issues this PR will resolve]
Check List
- [X] Commits are signed per the DCO using --signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.
This would be awesome to have!
This is a good start. I like having the compatibility matrices. Might be good, though, to also add a simple "Getting Started Examples"?
Maybe an example on how to write to a dataframe in scala?
e.g.,
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .config("opensearch.nodes", "127.0.0.1")
  .config("opensearch.net.http.auth.user", "admin")
  .config("opensearch.net.http.auth.pass", "admin")
  .config("opensearch.net.ssl", "true")
  .config("opensearch.batch.size.bytes", "1kb")
  .config("opensearch.net.ssl.cert.allow.self.signed", "true")
  .getOrCreate()
Or an example of how to use it with pyspark, like I demonstrate in my comment on #153.
I'm happy to add if you'd like.
Go for it!
Is anyone making any updates to this (@nknize )? We are targeting next week to publish it. Thanks!
Yes. I'll put the example in tomorrow.
Hi Nick, this is still awaiting your input. Thank you!!
@nknize @mnkugler @wbeckler - If you can make the final edits, update the blog date, and let @krisfreedain know when it's ready to go, we can get this posted to the blog tomorrow. Otherwise, we'll need to hold this until next Wednesday.
Let's hold to Wednesday. I was working up the example with the published artifacts and noticed they don't support Spark 3. We may want to republish the Spark 3 artifacts before publishing the blog.
@mnkugler and @wbeckler - Are we good to publish this today?
Still waiting on @nknize's changes.
@pajuric The blocker right now is that the released OpenSearch-Hadoop artifacts are not compatible with Spark 3. Thus the compatibility matrix in this blog post is not correct, and the example code I'm providing will not work for readers running Spark 3:
e.g.,
[error] Modules were resolved with conflicting cross-version suffixes in ProjectRef(uri("file:/...
[error] org.apache.spark:spark-core _2.13, _2.11
From example build.sbt
ThisBuild / scalaVersion := "2.13.0"

lazy val root = (project in file("."))
  .settings(
    name := "opensearch-spark-example"
  )

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.2.4" exclude("javax", "servlet") exclude("org.apache", "hadoop"),
  "org.opensearch.client" % "opensearch-hadoop" % "1.0.1",
  "org.antlr" % "antlr4-runtime" % "4.8",
  "org.codehaus.janino" % "commons-compiler" % "3.0.8",
  "org.codehaus.janino" % "janino" % "3.0.8"
)
We need to publish the Spark 3 compatible version, which is built and packaged with the artifacts from the spark/sql-30 module.
I opened an issue to move this forward: https://github.com/opensearch-project/opensearch-hadoop/issues/304
@vagimeli @nknize - Just checking the status on this blog to see if there are any updates?
@pajuric I've not heard from the authors in a while. I'm adding them to this comment, as they need to provide the update.
@nknize @harshavamsi Please update on the status of this blog. Is the text final and ready for an editorial review?
@wbeckler @Xtansia - Please provide an update on the blog, as I understand it has been transferred over to you both.
@wbeckler - Are we OK to close this blog?