
[FEATURE] Upgrade to the Spark DataSource V2 APIs that unify streaming and batch

Open jiazhai opened this issue 4 years ago • 3 comments

Thanks to @hanguyen6 for the proposal in PR #42.

The Spark DataSource V2 APIs have been refactored to unify streaming and batch. Should we upgrade our code to follow the updated abstraction?

https://issues.apache.org/jira/browse/SPARK-25390

write API refactor: https://docs.google.com/document/d/1vI26UEuDpVuOjWw4WPoH2T6y8WAekwtI7qoowhOFnI4/edit#

read API refactoring: https://docs.google.com/document/d/1uUmKCpWLdh9vHxP7AWJ9EgbwB_U6T3EJYNjhISGmiQg/edit#

I have updated code to follow that abstraction and can help if needed.
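For context, here is a rough sketch of what the unified abstraction looks like in the Spark 3.x connector API, where a single `Table` advertises both batch and streaming read capabilities and a single `Scan` produces either a `Batch` or a `MicroBatchStream`. The class names (`ExampleProvider`, `ExampleTable`, `ExampleScan`) and the single-column schema are hypothetical and are not the pulsar-spark classes; the batch and streaming bodies are left unimplemented on purpose. It assumes a Spark 3.x dependency on the classpath.

```scala
import java.util

import org.apache.spark.sql.connector.catalog.{SupportsRead, Table, TableCapability, TableProvider}
import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.connector.read.{Batch, Scan, ScanBuilder}
import org.apache.spark.sql.connector.read.streaming.MicroBatchStream
import org.apache.spark.sql.types.{BinaryType, StructField, StructType}
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Hypothetical entry point: one provider serves both batch and streaming queries.
class ExampleProvider extends TableProvider {
  // Assumed schema for illustration only: a single binary "value" column.
  override def inferSchema(options: CaseInsensitiveStringMap): StructType =
    StructType(Seq(StructField("value", BinaryType)))

  override def getTable(
      schema: StructType,
      partitioning: Array[Transform],
      properties: util.Map[String, String]): Table = new ExampleTable(schema)
}

// The table declares which read modes it supports through its capabilities.
class ExampleTable(tableSchema: StructType) extends Table with SupportsRead {
  override def name(): String = "example"
  override def schema(): StructType = tableSchema
  override def capabilities(): util.Set[TableCapability] =
    util.EnumSet.of(TableCapability.BATCH_READ, TableCapability.MICRO_BATCH_READ)

  override def newScanBuilder(options: CaseInsensitiveStringMap): ScanBuilder =
    new ScanBuilder {
      override def build(): Scan = new ExampleScan(tableSchema)
    }
}

// One logical scan; Spark calls toBatch or toMicroBatchStream depending on the query type.
class ExampleScan(schema: StructType) extends Scan {
  override def readSchema(): StructType = schema
  override def toBatch(): Batch = ???  // batch planning would go here
  override def toMicroBatchStream(checkpointLocation: String): MicroBatchStream =
    ???  // streaming offsets and planning would go here
}
```

The point of the sketch is only the shape: the provider, table, and scan are shared, and the batch/streaming split happens at the last step. The write side follows a similar pattern through `SupportsWrite`.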

jiazhai, Feb 23 '21 10:02

@hanguyen6 would you like to work on this issue?

jiazhai, Feb 23 '21 10:02

@jiazhai Yes, I am working on it right now. I will post an update when it is done.

hanguyen6, Feb 23 '21 14:02

@jiazhai, @jianyun8023 Could you please review PR #50?

hanguyen6, Jun 29 '21 00:06