incubator-livy icon indicating copy to clipboard operation
incubator-livy copied to clipboard

[Livy 899] The state of interactive session is always idle when using thrift protocol.

Open jianzhenwu opened this issue 2 years ago • 4 comments

What changes were proposed in this pull request?

  1. Should broadcast ReplState to the RSCClient in RSCDriver or the session state is always idle when using thrift protocol.
  2. Add unit test testSessionState in ThriftSessionTest to verify that the state change is as expected.

https://issues.apache.org/jira/browse/LIVY-899

How was this patch tested?

This patch is tested by unit tests. The the state change is recorded in replStateChangedHistoryInTest when testing.

jianzhenwu avatar Dec 18 '22 04:12 jianzhenwu

Codecov Report

Merging #377 (d062281) into master (314f2de) will decrease coverage by 36.98%. The diff coverage is 0.00%.

:exclamation: Current head d062281 differs from pull request most recent head 322af8e. Consider uploading reports for the commit 322af8e to get more accurate results

@@              Coverage Diff              @@
##             master     #377       +/-   ##
=============================================
- Coverage     65.58%   28.61%   -36.98%     
+ Complexity      952      379      -573     
=============================================
  Files           103      103               
  Lines          6044     6057       +13     
  Branches        911      912        +1     
=============================================
- Hits           3964     1733     -2231     
- Misses         1536     3977     +2441     
+ Partials        544      347      -197     
Impacted Files Coverage Δ
...rc/main/java/org/apache/livy/rsc/BaseProtocol.java 0.00% <ø> (-52.13%) :arrow_down:
...c/src/main/java/org/apache/livy/rsc/RSCClient.java 0.00% <0.00%> (-72.79%) :arrow_down:
...ava/org/apache/livy/rsc/driver/JobContextImpl.java 0.00% <0.00%> (-86.67%) :arrow_down:
...in/java/org/apache/livy/rsc/driver/JobWrapper.java 0.00% <0.00%> (-88.58%) :arrow_down:
...ain/java/org/apache/livy/rsc/driver/RSCDriver.java 0.00% <0.00%> (-80.92%) :arrow_down:

... and 82 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov-commenter avatar Dec 18 '22 04:12 codecov-commenter

I am not sure about the change, as this does not involve only the thriftserver module, but all interactive sessions. If the problem involves all interactive sessions, then it is ok, otherwise the fix is probably not the correct one. Can you check if the same issue affects all interactive jobs? If so, I think we can go ahead with this PR, but I would recommend a different reviewer more familiar with the code in rsc, otherwise, the fix needs to be done only on the thriftserver part and we need more investigation to understand why normal interactive sessions do not have the issue while thriftserver has. Thanks.

The same issue doesn't affect interactive jobs which created by REST API. The is because the handle functions are different in the driver. The REST API uses handle(ctx: ChannelHandlerContext, msg: BaseProtocol.ReplJobRequest) in the ReplDriver while thrift API uses handle(ChannelHandlerContext ctx, JobRequest<?> msg) in the RSCDRiver.
ReplDriver changes state when execute code by invoking stateChangedCallback which broadcast the state to all RPC clients. RSCDriver does not broadcast state to RPC clients. I think this patch would not affect other interactive jobs except using thrift API.

jianzhenwu avatar Dec 19 '22 02:12 jianzhenwu

This fix is causing many changes. The submit method is used for many purposes, such as adding jar, files, so I do not think is a good idea to have a fix that has such a large impact on other cases that should not be affected. The current fix would also set to busy when fetching the results, and executing schema operations, and even while unregistering a session...

I think a better place for the fix would be SqlJob, although it is not a perfect solution either. So I would be very happy if someone comes with a better alternative.

mgaido91 avatar Dec 19 '22 21:12 mgaido91

This fix is causing many changes. The submit method is used for many purposes, such as adding jar, files, so I do not think is a good idea to have a fix that has such a large impact on other cases that should not be affected. The current fix would also set to busy when fetching the results, and executing schema operations, and even while unregistering a session...

I think a better place for the fix would be SqlJob, although it is not a perfect solution either. So I would be very happy if someone comes with a better alternative.

sorry for the late reply. I have migrated the changes to SqlJob, can you @mgaido91 review the code again?

jianzhenwu avatar Jun 18 '23 07:06 jianzhenwu