
Import data inside Kafka value to Nebula

Open sworduo opened this issue 3 years ago • 3 comments

This is the resolution for issue https://github.com/vesoft-inc/nebula-spark-utils/issues/130 (import data from Kafka to Nebula). With this update, nebula-exchange can parse data from the value field of Kafka messages and import it to Nebula. Note that the other fields Kafka carries, such as offset and key, are discarded. Also, since Kafka is a streaming source, the data source cannot be switched once Kafka is chosen, which means the tags/edges defined in the configuration can only be parsed from Kafka. Hence, the Kafka config is defined independently rather than inside each tag/edge config, so all tags/edges share the same Kafka config. More details can be found in the accompanying README-CN.md.

sworduo · Sep 28 '21 03:09
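For context, the following is a minimal sketch of the approach the PR describes, assuming Spark Structured Streaming's Kafka source and a JSON payload in the value field. The bootstrap server, topic name, and property names are placeholders, not values taken from this PR.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object KafkaValueParseSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-value-to-dataframe")
      .getOrCreate()

    // Hypothetical shared Kafka settings: every tag/edge reads the same topic.
    val bootstrapServers = "127.0.0.1:9092" // assumption
    val topic            = "nebula-topic"   // assumption

    // Schema of the JSON carried in Kafka's value field (assumed field names).
    val valueSchema = StructType(Seq(
      StructField("vid",  StringType, nullable = false),
      StructField("name", StringType, nullable = true),
      StructField("age",  StringType, nullable = true)
    ))

    // The Kafka source exposes key, value, topic, partition, offset, timestamp;
    // only value is kept, the other fields are dropped as described in the PR.
    val raw: DataFrame = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", bootstrapServers)
      .option("subscribe", topic)
      .load()

    val parsed: DataFrame = raw
      .select(from_json(col("value").cast(StringType), valueSchema).as("v"))
      .select("v.*") // columns now match the tag/edge properties to write to Nebula

    parsed.writeStream
      .format("console") // stand-in for the Nebula writer
      .start()
      .awaitTermination()
  }
}
```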

CLA assistant check
All committers have signed the CLA.

CLAassistant · Sep 28 '21 03:09

Thank you so much @sworduo, this PR takes real-world Kafka streaming source usability to the next level.

@Nicole00 🎉

wey-gu · Sep 28 '21 07:09

Thanks for your PR to support parsing Kafka's value. This PR changes the architecture of Exchange shown in the doc https://docs.nebula-graph.com.cn/2.5.1/nebula-exchange/about-exchange/ex-ug-what-is-exchange/. Can we just modify the StreamingReader to parse Kafka's value into a DataFrame?

Nicole00 · Oct 12 '21 01:10
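As an illustration of the alternative Nicole00 raises, here is a hypothetical reader-style class that returns an already parsed DataFrame, so the downstream tag/edge processing would not need to change. The class name and constructor parameters are assumptions for the sketch, not the actual Exchange code.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.StructType

// Hypothetical reader: names and parameters are illustrative only.
class KafkaValueReader(session: SparkSession,
                       servers: String,
                       topic: String,
                       valueSchema: StructType) {

  // Returns a DataFrame whose columns are already the parsed value fields,
  // so callers see the same shape as any other batch or streaming source.
  def read(): DataFrame = {
    session.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", servers)
      .option("subscribe", topic)
      .load()
      .select(from_json(col("value").cast("string"), valueSchema).as("value"))
      .select("value.*")
  }
}
```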