streaming-at-scale
eventhubs-databricks-eventhubs
How should I write the data back to Event Hubs? When I read it in from the input Event Hub it's in binary format, so should I write it back to the output Event Hub in binary format as well? Tagging @jcocchi, or please let me know who else I should tag!
Data in Event Hubs is stored in binary format, but you should send it as plain text (if you're sending JSON). It will be stored in binary format automatically.
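To make the binary-versus-text point concrete, here is a minimal sketch in plain Python (no Spark or Event Hubs client involved). It only illustrates that a JSON string and its UTF-8 bytes are two views of the same payload; it is not the connector's actual code path. The function names and the sample fields (`deviceId`, `temperature`) are made up for illustration.

```python
import json


def encode_event(payload: dict) -> bytes:
    """Serialize a dict to JSON text, then to the UTF-8 bytes that get stored."""
    return json.dumps(payload).encode("utf-8")


def decode_event(body: bytes) -> dict:
    """Reverse step: bytes read from the hub -> JSON text -> dict."""
    return json.loads(body.decode("utf-8"))


# Hypothetical sample event; the field names are assumptions.
event = {"deviceId": "sensor-1", "temperature": 21.5}
body = encode_event(event)       # what the stored event body looks like
restored = decode_event(body)    # what the reader gets after casting to string
```

So the binary body seen on the read side is just the encoded form of the text that was sent, which is why casting the body to a string on read gives the JSON back.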
You need to put the body in a column called 'body':
https://github.com/Azure/azure-event-hubs-spark/blob/master/docs/structured-streaming-eventhubs-integration.md#creating-an-eventhubs-sink-for-streaming-queries
To generate JSON from a struct: https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html#to_json-org.apache.spark.sql.Column-
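Putting the two links above together, a rough PySpark sketch could look like the following. It assumes the azure-event-hubs-spark connector is attached to the cluster and that `df` is a streaming DataFrame. The helper names (`eventhubs_conf`, `to_eventhubs_body`, `start_eventhubs_sink`) are hypothetical, not part of the connector. Some connector releases also expect the connection string to be encrypted first, so check the linked docs for the version in use.

```python
from typing import Any, Dict


def eventhubs_conf(connection_string: str) -> Dict[str, str]:
    """Build the options dict passed to the sink (hypothetical helper).

    Assumption: the string is already in whatever form the installed
    connector version expects (some releases require it to be encrypted).
    """
    return {"eventhubs.connectionString": connection_string}


def to_eventhubs_body(df: Any) -> Any:
    """Collapse all columns into one JSON string column named 'body'."""
    # Imported lazily so this module can be loaded without a Spark runtime.
    from pyspark.sql.functions import struct, to_json

    # struct("*") packs every column into one struct; to_json serializes it.
    return df.select(to_json(struct("*")).alias("body"))


def start_eventhubs_sink(df: Any, conf: Dict[str, str], checkpoint_dir: str) -> Any:
    """Start a streaming query that writes the 'body' column to Event Hubs."""
    return (
        to_eventhubs_body(df)
        .writeStream
        .format("eventhubs")
        .options(**conf)
        .option("checkpointLocation", checkpoint_dir)
        .start()
    )
```

The body is written as a plain JSON string here; no manual conversion to binary is done before the write.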
I have an example of reading from an Event Hub and then writing back to an Event Hub using Structured Streaming here: https://github.com/mpfishe2/az-databricks-realtime-alert-system/blob/master/Real-Time%20Alerting.ipynb. It's a simple example.