[Umbrella] Flink Engine Improvement and Quality Assurance

Open yaooqinn opened this issue 2 years ago • 5 comments

Code of Conduct

Search before asking

  • [X] I have searched in the issues and found no similar issues.

Describe the proposal

We introduced the Flink engine in https://github.com/apache/incubator-kyuubi/issues/1322.

In this ticket, we collect feedback, improvements, and bug fixes, aiming to make the Flink engine production-ready.

Task list

Bugs

Improvement

  • [x] #2002 @link3280
  • [ ] #1652 @yanghua
  • [ ] #1865 @SteNicholas
  • [ ] #2159 @deadwind4
  • [x] #2405
  • [ ] #2252

Documentations

Brainstorming

Miscs

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

yaooqinn avatar Mar 11 '22 02:03 yaooqinn

@yaooqinn, the module label should be flink, not hive.

SteNicholas avatar Mar 11 '22 02:03 SteNicholas

> @yaooqinn, the module label should be flink, not hive.

oops..

yaooqinn avatar Mar 11 '22 02:03 yaooqinn

@yaooqinn shall we make this a KPIP and let the corresponding issues follow the naming pattern like [SUBTASK][KPIP-X]?

link3280 avatar Apr 18 '22 11:04 link3280

I am not sure we can propose a KPIP given the current state of this ticket; it does not seem to meet the requirements for a KPIP.

In fact, we should not create subtasks for KPIP-2, as it has already been resolved. Perhaps [SUBTASK][#2100] is enough?

yaooqinn avatar Apr 18 '22 12:04 yaooqinn

@yaooqinn LGTM

link3280 avatar Apr 18 '22 13:04 link3280

Postponing to 1.8, because this feature is not under rapid development and is not expected to be completed in a short time.

pan3793 avatar Feb 07 '23 06:02 pan3793

Could the JDBC interface support fetching results from asynchronous real-time tasks? Can this be done? @pan3793

waywtdcc avatar Mar 18 '23 08:03 waywtdcc

@waywtdcc technically, I don't think there is any blocker in the Kyuubi framework. The JDBC driver retrieves results from the Kyuubi Server in mini-batches, and we do a similar thing in Spark, which is called incremental collection.

So it should be possible if the Flink engine can return the streaming data as an Iterator.
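To make the mini-batch idea concrete, here is a minimal Python sketch. It is only a conceptual illustration, not Kyuubi or Flink code: `fetch_in_batches` and `batch_size` are hypothetical names, and the row source is a plain Python iterable standing in for a streaming result.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def fetch_in_batches(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield rows from a (possibly unbounded) source in mini-batches.

    Hypothetical helper for illustration only. Rows are pulled lazily,
    so no more than one batch is materialized at a time.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        # islice pulls at most batch_size rows from the underlying iterator.
        batch = list(islice(it, batch_size))
        if not batch:
            return  # the source is exhausted
        yield batch


if __name__ == "__main__":
    # A finite range stands in for a stream of result rows.
    for batch in fetch_in_batches(range(5), 2):
        print(batch)
```

Because the source is consumed lazily, the same loop works whether the iterable is finite or keeps producing rows; the consumer simply asks for the next batch when it is ready.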

cc the Flink experts @SteNicholas @link3280 @yanghua

pan3793 avatar Mar 18 '23 14:03 pan3793

@waywtdcc are you using Flink 1.14? The Kyuubi community is going to add support for Flink 1.17 and drop support for Flink 1.14 because of a lack of developer resources.

It would be great if you could share more about your use case, challenges, and expectations for the Kyuubi Flink engine :)

pan3793 avatar Mar 18 '23 14:03 pan3793

> @waywtdcc are you using Flink 1.14? The Kyuubi community is going to add support for Flink 1.17 and drop support for Flink 1.14 because of a lack of developer resources.
>
> It would be great if you could share more about your use case, challenges, and expectations for the Kyuubi Flink engine :)

We use Flink 1.14 for data synchronization and real-time computing.

waywtdcc avatar Mar 20 '23 01:03 waywtdcc

> @waywtdcc technically, I don't think there is any blocker in the Kyuubi framework. The JDBC driver retrieves results from the Kyuubi Server in mini-batches, and we do a similar thing in Spark, which is called incremental collection.
>
> So it should be possible if the Flink engine can return the streaming data as an Iterator.
>
> cc the Flink experts @SteNicholas @link3280 @yanghua

OK, I see. What if I need to get the list of historical checkpoints, and stop the job after triggering a savepoint?

waywtdcc avatar Mar 20 '23 01:03 waywtdcc

All you need to do is construct a proper FetchIterator on the Flink engine side.
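As a rough picture of what such an iterator has to track, here is a small Python sketch of a forward-only fetch cursor. It is an assumption-laden illustration, not the actual FetchIterator interface: the class name `ForwardFetchIterator` and the members `fetch_next`, `fetch_start`, `position`, and `exhausted` are all hypothetical.

```python
from typing import Generic, Iterator, List, TypeVar

T = TypeVar("T")


class ForwardFetchIterator(Generic[T]):
    """Hypothetical forward-only cursor over a possibly unbounded row source."""

    def __init__(self, source: Iterator[T]) -> None:
        self._source = source
        self._fetch_start = 0    # offset of the first row of the latest batch
        self._position = 0       # offset just past the last row handed out
        self._exhausted = False  # set once the source reports no more rows

    @property
    def fetch_start(self) -> int:
        return self._fetch_start

    @property
    def position(self) -> int:
        return self._position

    @property
    def exhausted(self) -> bool:
        return self._exhausted

    def fetch_next(self, max_rows: int) -> List[T]:
        """Return up to max_rows rows and advance the cursor."""
        if max_rows <= 0:
            raise ValueError("max_rows must be positive")
        self._fetch_start = self._position
        batch: List[T] = []
        for _ in range(max_rows):
            try:
                batch.append(next(self._source))
            except StopIteration:
                self._exhausted = True
                break
        self._position += len(batch)
        return batch


if __name__ == "__main__":
    cursor = ForwardFetchIterator(iter("abcde"))
    while not cursor.exhausted:
        print(cursor.fetch_start, cursor.fetch_next(2))
```

The sketch deliberately does not support rewinding: for a streaming source, going back to an earlier offset would require buffering rows already handed out, which is a separate design decision.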

pan3793 avatar Mar 20 '23 03:03 pan3793

> > @waywtdcc technically, I don't think there is any blocker in the Kyuubi framework. The JDBC driver retrieves results from the Kyuubi Server in mini-batches, and we do a similar thing in Spark, which is called incremental collection. So it should be possible if the Flink engine can return the streaming data as an Iterator. cc the Flink experts @SteNicholas @link3280 @yanghua
>
> OK, I see. What if I need to get the list of historical checkpoints, and stop the job after triggering a savepoint?

@waywtdcc There are ongoing efforts in Flink to improve savepoint management via SQL statements (see FLIP-222 for details). Kyuubi will support these statements once they are available.

link3280 avatar Mar 20 '23 06:03 link3280

After adding a JAR package, how can I execute a specific method from that JAR?

waywtdcc avatar Mar 21 '23 03:03 waywtdcc

> All you need to do is construct a proper FetchIterator on the Flink engine side.

Yes, we also need to get the resulting data in a streaming manner.

waywtdcc avatar Mar 21 '23 08:03 waywtdcc