Handle null pointer exception in isPartitionUnderReplicated check

Open bsandeep23 opened this issue 1 year ago • 0 comments

While the isPartitionUnderReplicated check is running, the partition being checked can be deleted. When that happens, the lookup for that partition returns null and the operation currently fails with a NullPointerException.

https://github.com/linkedin/cruise-control/blob/f4ca900e58944478b539955a3d70fcd802c0e1a8/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/KafkaCruiseControlUtils.java#L786

One such exception trace when executing a demote operation:

2023/11/07 06:40:42.675 WARN [OperationRunnable] [ServletSessionExecutor-1] [kafka-cruise-control] [] Received exception when trying to execute runnable for "Demote"
com.linkedin.kafka.cruisecontrol.exception.KafkaCruiseControlException: java.lang.NullPointerException
    at com.linkedin.kafka.cruisecontrol.servlet.handler.async.runnable.GoalBasedOperationRunnable.computeResult(GoalBasedOperationRunnable.java:167) ~[cruise-control-2.5.129.jar:?]
    at com.linkedin.kafka.cruisecontrol.servlet.handler.async.runnable.DemoteBrokerRunnable.getResult(DemoteBrokerRunnable.java:115) ~[cruise-control-2.5.129.jar:?]
    at com.linkedin.kafka.cruisecontrol.servlet.handler.async.runnable.DemoteBrokerRunnable.getResult(DemoteBrokerRunnable.java:57) ~[cruise-control-2.5.129.jar:?]
    at com.linkedin.kafka.cruisecontrol.servlet.handler.async.runnable.OperationRunnable.run(OperationRunnable.java:45) [cruise-control-2.5.129.jar:?]
    at com.linkedin.kafka.cruisecontrol.servlet.handler.async.runnable.GoalBasedOperationRunnable.run(GoalBasedOperationRunnable.java:36) [cruise-control-2.5.129.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.lang.NullPointerException
    at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlUtils.isPartitionUnderReplicated(KafkaCruiseControlUtils.java:785) ~[cruise-control-2.5.129.jar:?]
    at com.linkedin.kafka.cruisecontrol.analyzer.goals.PreferredLeaderElectionGoal.maybeMoveReplicaToEndOfReplicaList(PreferredLeaderElectionGoal.java:69) ~[cruise-control-2.5.129.jar:?]
    at com.linkedin.kafka.cruisecontrol.analyzer.goals.PreferredLeaderElectionGoal.optimize(PreferredLeaderElectionGoal.java:95) ~[cruise-control-2.5.129.jar:?]
    at com.linkedin.kafka.cruisecontrol.analyzer.GoalOptimizer.optimizations(GoalOptimizer.java:467) ~[cruise-control-2.5.129.jar:?]
    at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.optimizations(KafkaCruiseControl.java:605) ~[cruise-control-2.5.129.jar:?]
    at com.linkedin.kafka.licruisecontrol.servlet.handler.async.runnable.LiDemoteBrokerRunnable.workWithClusterModel(LiDemoteBrokerRunnable.java:77) ~[likafka-cruise-control-impl_2.12-3.2.88.jar:?]
    at com.linkedin.kafka.cruisecontrol.servlet.handler.async.runnable.GoalBasedOperationRunnable.computeResult(GoalBasedOperationRunnable.java:161) ~[cruise-control-2.5.129.jar:?]
    ... 9 more

The expected behavior is to null-check the result of the partition lookup before accessing it, and to handle the partition-not-found case gracefully instead of failing the whole operation.
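
A minimal sketch of such a guard, for illustration only. It does not use the real Cruise Control or Kafka classes: `PartitionMetadata` and the `Map`-based cluster lookup are hypothetical stand-ins, and returning `false` for a missing partition is just one possible policy (logging and skipping, or throwing a dedicated exception, would be alternatives).

```java
import java.util.HashMap;
import java.util.Map;

public class UnderReplicatedCheckSketch {

    /** Hypothetical stand-in for the per-partition metadata returned by a cluster lookup. */
    static final class PartitionMetadata {
        final int[] replicas;
        final int[] inSyncReplicas;

        PartitionMetadata(int[] replicas, int[] inSyncReplicas) {
            this.replicas = replicas;
            this.inSyncReplicas = inSyncReplicas;
        }
    }

    /**
     * Returns true if the partition has fewer in-sync replicas than replicas.
     * A partition that is no longer present in the metadata (for example,
     * because its topic was deleted mid-check) is treated as not
     * under-replicated. That policy is an assumption of this sketch.
     */
    static boolean isPartitionUnderReplicated(Map<String, PartitionMetadata> cluster,
                                              String topicPartition) {
        PartitionMetadata info = cluster.get(topicPartition);
        if (info == null) {
            // Partition not found: handle gracefully instead of dereferencing null.
            return false;
        }
        return info.inSyncReplicas.length != info.replicas.length;
    }

    public static void main(String[] args) {
        Map<String, PartitionMetadata> cluster = new HashMap<>();
        cluster.put("topicA-0", new PartitionMetadata(new int[] {1, 2, 3}, new int[] {1, 2}));
        cluster.put("topicA-1", new PartitionMetadata(new int[] {1, 2, 3}, new int[] {1, 2, 3}));

        System.out.println(isPartitionUnderReplicated(cluster, "topicA-0"));
        System.out.println(isPartitionUnderReplicated(cluster, "topicA-1"));
        // "deleted-0" is absent from the metadata; the call returns instead of throwing.
        System.out.println(isPartitionUnderReplicated(cluster, "deleted-0"));
    }
}
```

Whichever policy is chosen, callers such as the goal optimization path in the trace above would need to agree on what a missing partition means, so that a demote request is not aborted by a topic deletion that happens concurrently.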

bsandeep23 · Dec 10 '23 15:12