HBASE-28030: Add unit test for FileNotFoundException when splitting a WAL in SplitLogWorker
This PR adds a unit test that reproduces the FileNotFoundException encountered in SplitLogWorker.java, as reported in [HBASE-20583].
The workflow to reproduce it is as follows. First, the SplitLogManager on the master renames the WAL directory to a new directory with a “-splitting” suffix. Then the SplitLogWorker on a region server grabs the split task and completes it. Once the task is finished, the ZkWatcher on the master first deletes the WAL file and then deletes the corresponding znode.
If an IOException occurs between deleting the WAL and deleting the znode, a condition check decides whether to schedule the task again. If the region server that executed the task is dead when this resubmit check is performed (this requires a very specific crash time: after the task is finished but before the resubmit check), the task is resubmitted. In the new round, because the WAL has already been deleted, the worker hits the FileNotFoundException described in [HBASE-20583].
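The sequence above can be modeled outside HBase with plain files. This is only a rough sketch of the ordering problem, not the actual test in this PR: `WalSplitRaceSketch`, `splitWal`, and `deleteZnode` are hypothetical names, a temp directory stands in for the WAL directory, and a boolean stands in for the znode.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical model of the race; none of these names come from HBase.
public class WalSplitRaceSketch {

    // Stand-in for a split task: fails if the WAL no longer exists.
    static void splitWal(Path wal) throws IOException {
        if (!Files.exists(wal)) {
            throw new FileNotFoundException("WAL not found: " + wal);
        }
        Files.readAllBytes(wal); // a real worker would replay the edits here
    }

    // Stand-in for znode deletion, with an optional injected failure.
    static void deleteZnode(boolean injectFault) throws IOException {
        if (injectFault) {
            throw new IOException("injected fault before znode deletion");
        }
    }

    // Returns true if the resubmitted task hits FileNotFoundException.
    static boolean reproduce() {
        try {
            Path root = Files.createTempDirectory("wal-sketch");
            Path walDir = Files.createDirectory(root.resolve("rs1"));
            Files.write(walDir.resolve("wal.0"), new byte[] {1, 2, 3});

            // 1. The master renames the WAL directory with a "-splitting" suffix.
            Path splittingDir = Files.move(walDir, root.resolve("rs1-splitting"));
            Path wal = splittingDir.resolve("wal.0");

            // 2. The worker finishes the split task.
            splitWal(wal);

            // 3. The master deletes the WAL first, then fails before deleting the znode.
            Files.delete(wal);
            boolean znodeDeleted;
            try {
                deleteZnode(true);
                znodeDeleted = true;
            } catch (IOException e) {
                znodeDeleted = false;
            }

            // 4. The worker is assumed dead at the resubmit check, so the task runs again.
            boolean workerDead = true;
            if (!znodeDeleted && workerDead) {
                try {
                    splitWal(wal);
                } catch (FileNotFoundException e) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("FileNotFoundException on resubmit: " + reproduce());
    }
}
```

The point of the sketch is the ordering: the WAL is gone before the task's bookkeeping entry is, so any retry of the task reads a file that was already deleted.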
To reproduce this in the unit test, we added a new HBaseFaultInjector class that injects the IOException and the region server shutdown after the task finishes but before the resubmit check.
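For readers unfamiliar with the fault-injection pattern, here is a minimal sketch of how such a hook can work. It is an illustration under assumptions, not the HBaseFaultInjector added in this PR: `FaultInjectorSketch`, `afterTaskFinished`, and `finishTask` are hypothetical names. The idea is a swappable singleton whose hooks do nothing by default and are overridden only in tests.

```java
import java.io.IOException;

// Hypothetical injector; the class in this PR may look different.
public class FaultInjectorSketch {
    private static FaultInjectorSketch instance = new FaultInjectorSketch();

    public static FaultInjectorSketch get() {
        return instance;
    }

    // Tests swap in a subclass; production code never calls this.
    public static void set(FaultInjectorSketch injector) {
        instance = injector;
    }

    // Hook called after the task is finished; a no-op unless overridden.
    public void afterTaskFinished() throws IOException {
    }

    // Stand-in for the code path under test, with the hook at the critical point.
    public static String finishTask() {
        try {
            get().afterTaskFinished();
            return "znode deleted";
        } catch (IOException e) {
            return "resubmit check";
        }
    }

    public static void main(String[] args) {
        System.out.println(finishTask()); // default injector: no fault
        set(new FaultInjectorSketch() {
            @Override
            public void afterTaskFinished() throws IOException {
                throw new IOException("injected");
            }
        });
        System.out.println(finishTask()); // overridden injector: fault path
        set(new FaultInjectorSketch()); // restore the default
    }
}
```

A test would install the throwing injector before triggering the split, then restore the default afterwards so other tests are unaffected.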
Thanks for opening a PR. In HBase, we usually open a PR against the master branch first and then cherry-pick to other branches.
Could you please open a PR against the master branch? Or does this only affect branch-2.3?