bookkeeper icon indicating copy to clipboard operation
bookkeeper copied to clipboard

Fix bookie process still alive when bookie startup is not successful and shutdown

Open TakaHiR07 opened this issue 2 years ago • 9 comments

Motivation

When bookie startup encounter IOException in BookieImpl#readJournal(), bookie startup is not successful and then trigger shutdown(). However, bookie process is still alive.

The reason is IOException is caught in BookieImpl and trigger shutdown. Bookie actually is not running. But the exception do not throw to startComponent.

So BookieServer.main / server.Main.doMain still wait for the startComponent future to complete.

The relevant code is as following:

https://github.com/apache/bookkeeper/blob/3221aa30924825cb4c1a5b00fb68dec44712946e/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieImpl.java#L654-L660

https://github.com/apache/bookkeeper/blob/3221aa30924825cb4c1a5b00fb68dec44712946e/bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieServer.java#L123-L131

https://github.com/apache/bookkeeper/blob/3221aa30924825cb4c1a5b00fb68dec44712946e/bookkeeper-common/src/main/java/org/apache/bookkeeper/common/component/ComponentStarter.java#L63-L87

Changes

throw exception to startComponent when bookie start encounter error.

TakaHiR07 avatar Oct 18 '23 10:10 TakaHiR07

@TakaHiR07 Could you take a look at the check style issue? Thanks.

hangc0276 avatar Oct 26 '23 06:10 hangc0276

@TakaHiR07 Could you take a look at the check style issue? Thanks.

have fix checkstyle.

TakaHiR07 avatar Oct 26 '23 06:10 TakaHiR07

@TakaHiR07 Would you please help fix the failed tests? Thanks.

hangc0276 avatar Mar 05 '24 07:03 hangc0276

@TakaHiR07 Would you please help fix the failed tests? Thanks.

@hangc0276 I have fix the test since powermock is removed in master branch.

TakaHiR07 avatar Mar 06 '24 08:03 TakaHiR07

update the branch

hezhangjian avatar May 15 '24 05:05 hezhangjian

Normally, "Main" thread will wait for "component-shutdown-thread" thread to complete the "component start future". Your edition will cause future complete too soon, and main thread exit, which may cause shutdown not finished and process exit?

xiang092689 avatar May 15 '24 07:05 xiang092689

Normally, "Main" thread will wait for "component-shutdown-thread" thread to complete the "component start future". Your edition will cause future complete too soon, and main thread exit, which may cause shutdown not finished and process exit?

"component start future" would complete only when exception throw. if bookie.start() encounter exception, it may catch some exception and go into shutdown(). only shutdown finish, then bookie.isRunning() become false, then go into the edition code.

TakaHiR07 avatar May 15 '24 07:05 TakaHiR07

I think it's not an APIException, should we define this exception in https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/ExitCode.java ?

ExitCode looks like pointing out the exact reason why bookie start fail. It seems not appropriate to put this exception here, this exception only point out that bookie start fail.

TakaHiR07 avatar May 15 '24 07:05 TakaHiR07

Many failed test shows that in previous design, bookieServer start failed, but not throw exception. I think these test is also unreasonable and should be fixed. Looking forward to more views, then I would fix these test.

TakaHiR07 avatar May 15 '24 08:05 TakaHiR07