[ZEPPELIN-4031] Fixed Unable to detect that the interpreter process was killed manually

Open xunliu opened this issue 6 years ago • 5 comments

What is this PR for?

Zeppelin server cannot detect when the interpreter process becomes unavailable because of a network exception or a program exception. The user's tasks then keep failing until the interpreter process is restarted manually, which makes for a bad user experience.

This PR detects the state of the remote interpreter process in Zeppelin server. When the interpreter process is found to be in an abnormal state, the session of that interpreter is cleaned up so that the interpreter becomes available again. This improves the user experience and also reduces the operation and maintenance burden of the system.
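The idea of checking the process state and cleaning up the session can be illustrated with a minimal sketch. This is not the code from this PR: `InterpreterSessionSketch`, `RemoteProcessStub`, `SessionRegistry`, and `cleanUpIfDead` are hypothetical names chosen for illustration, and the liveness probe is a plain boolean supplier standing in for whatever check the server really performs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch; none of these names come from the Zeppelin code base. */
class InterpreterSessionSketch {

    /** Stand-in for a handle to a remote interpreter process. */
    static final class RemoteProcessStub {
        private final BooleanSupplier aliveProbe;

        RemoteProcessStub(BooleanSupplier aliveProbe) {
            this.aliveProbe = aliveProbe;
        }

        boolean isAlive() {
            return aliveProbe.getAsBoolean();
        }
    }

    /** Maps a session id to the process that serves it. */
    static final class SessionRegistry {
        private final Map<String, RemoteProcessStub> sessions = new ConcurrentHashMap<>();

        void open(String sessionId, RemoteProcessStub process) {
            sessions.put(sessionId, process);
        }

        boolean hasSession(String sessionId) {
            return sessions.containsKey(sessionId);
        }

        /** Removes the session if its process is dead; returns true when a cleanup happened. */
        boolean cleanUpIfDead(String sessionId) {
            RemoteProcessStub process = sessions.get(sessionId);
            if (process != null && !process.isAlive()) {
                // Remove only if the mapping still points at the dead process.
                return sessions.remove(sessionId, process);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        AtomicBoolean alive = new AtomicBoolean(true);
        SessionRegistry registry = new SessionRegistry();
        registry.open("note-1", new RemoteProcessStub(alive::get));

        alive.set(false); // simulate the process being killed manually
        boolean cleaned = registry.cleanUpIfDead("note-1");
        System.out.println("cleaned=" + cleaned + ", hasSession=" + registry.hasSession("note-1"));
    }
}
```

Once the stale session is gone, the next run request can open a fresh session instead of failing against the dead process.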

What type of PR is it?

[Bug Fix]

What is the Jira issue?

  • https://issues.apache.org/jira/browse/ZEPPELIN-4031

How should this be tested?

RemoteInterpreterTest::testDetectIntpProcessKilled()

Screenshots (if appropriate)

Before bug fix

4031-BUG

After bug fix

4031-Fixed

Questions:

  • Do the license files need an update?
  • Are there breaking changes for older versions?
  • Does this need documentation?

xunliu avatar Mar 25 '19 11:03 xunliu

@zjffdu, @felixcheung, @jongyoul, please help me review the code. Thanks!

xunliu avatar Mar 25 '19 13:03 xunliu

This bug is not easy to understand through code review alone. A better way is to verify it by testing.

xunliu avatar Mar 26 '19 07:03 xunliu

OK, thanks for explaining. I'm OK with this. IMO it might be worthwhile to refactor the code to make it more straightforward, perhaps?

felixcheung avatar Mar 27 '19 05:03 felixcheung

There is no need to click a second time to execute successfully; the interpreter process is restored immediately.

4031-Fixed-2

xunliu avatar Mar 27 '19 08:03 xunliu

CI passes: https://travis-ci.org/liuxunorg/zeppelin/builds/512089147 @zjffdu, @felixcheung, please help me review the code. In the second commit, I adjusted the way the invalid remote interpreter process is recreated. Invalid remote interpreters are now fixed cleanly and can be used again immediately, without any manual intervention. See the GIF: https://github.com/apache/zeppelin/pull/3342#issuecomment-477046336

xunliu avatar Mar 28 '19 02:03 xunliu
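
For readers following the thread, the behaviour described in the last two comments (a dead interpreter is recreated on the same run request, so no second click is needed) can be sketched roughly as below. This is an illustration under assumptions, not the change made in this PR: `RecreateOnDemandSketch`, `FakeInterpreter`, and `Runner` are hypothetical names, and the real server talks to a separate process rather than an in-memory object.

```java
import java.util.function.Supplier;

/** Hypothetical sketch; these classes are not part of Zeppelin. */
class RecreateOnDemandSketch {

    /** In-memory stand-in for a remote interpreter process. */
    static final class FakeInterpreter {
        private boolean alive = true;

        void kill() {
            alive = false;
        }

        boolean isAlive() {
            return alive;
        }

        String interpret(String code) {
            if (!alive) {
                throw new IllegalStateException("interpreter process is gone");
            }
            return "ran: " + code;
        }
    }

    /** Checks liveness before each run and relaunches the interpreter if needed. */
    static final class Runner {
        private final Supplier<FakeInterpreter> launcher;
        private FakeInterpreter current;
        private int launches;

        Runner(Supplier<FakeInterpreter> launcher) {
            this.launcher = launcher;
        }

        synchronized String run(String code) {
            if (current == null || !current.isAlive()) {
                // Replace the missing or dead interpreter before running the code.
                current = launcher.get();
                launches++;
            }
            return current.interpret(code);
        }

        synchronized FakeInterpreter current() {
            return current;
        }

        synchronized int launches() {
            return launches;
        }
    }
}
```

Because the liveness check happens inside `run`, the caller sees a successful result on the first attempt after the process was killed, at the cost of one extra check per request.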