[HUDI-4412] Multiple writers NPE when Insert_overwrite

Open liujinhui1994 opened this issue 2 years ago • 9 comments

Tips

  • Thank you very much for contributing to Apache Hudi.
  • Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.

What is the purpose of the pull request

This PR fixes two minor issues:

  1. When an insert_overwrite operation is performed, the clusteringPlan in the requestedReplaceMetadata is null, so calling getFileIdsFromRequestedReplaceMetadata throws a NullPointerException.

  2. For an insert_overwrite operation where inflightCommitMetadata != null, getOperationType should be obtained from getHoodieInflightReplaceMetadata; the original code throws a NullPointerException.
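The two guards described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual patch: every class here is a hypothetical stand-in rather than a Hudi type, and both the empty-set fallback and the preference order in `operationType` are assumptions about one reasonable way to handle the null cases.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; these are NOT the Hudi classes.
class ReplaceMetadataSketch {

    /** Stand-in for a clustering plan: only the file IDs it would rewrite. */
    static final class ClusteringPlan {
        final List<String> fileIds;

        ClusteringPlan(List<String> fileIds) {
            this.fileIds = fileIds;
        }
    }

    /** Stand-in for requested replace metadata; clusteringPlan is null for insert_overwrite. */
    static final class RequestedReplaceMetadata {
        final String operationType;
        final ClusteringPlan clusteringPlan;

        RequestedReplaceMetadata(String operationType, ClusteringPlan clusteringPlan) {
            this.operationType = operationType;
            this.clusteringPlan = clusteringPlan;
        }
    }

    /** Stand-in for inflight replace metadata that carries the operation type. */
    static final class InflightReplaceMetadata {
        final String operationType;

        InflightReplaceMetadata(String operationType) {
            this.operationType = operationType;
        }
    }

    /** Guard 1: return an empty set instead of dereferencing a null clustering plan (assumed fallback). */
    static Set<String> fileIdsFromRequested(RequestedReplaceMetadata requested) {
        if (requested == null || requested.clusteringPlan == null) {
            return Collections.emptySet();
        }
        return new LinkedHashSet<>(requested.clusteringPlan.fileIds);
    }

    /** Guard 2: read the operation type from whichever metadata object is actually populated. */
    static String operationType(InflightReplaceMetadata inflight, RequestedReplaceMetadata requested) {
        if (inflight != null) {
            return inflight.operationType;
        }
        if (requested != null) {
            return requested.operationType;
        }
        return "unknown"; // assumed sentinel; real code might throw instead
    }

    public static void main(String[] args) {
        RequestedReplaceMetadata insertOverwrite =
            new RequestedReplaceMetadata("insert_overwrite", null);
        RequestedReplaceMetadata clustering =
            new RequestedReplaceMetadata("cluster", new ClusteringPlan(Arrays.asList("f1", "f2")));

        System.out.println(fileIdsFromRequested(insertOverwrite));
        System.out.println(fileIdsFromRequested(clustering));
        System.out.println(operationType(new InflightReplaceMetadata("insert_overwrite"), null));
    }
}
```

Whether an empty set is the right result for insert_overwrite depends on how conflict resolution is expected to treat replaced file groups, so treat the fallback as a placeholder for whatever the reviewers agree on.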

Committer checklist

  • [ ] Has a corresponding JIRA in PR title & commit

  • [ ] Commit message is descriptive of the change

  • [ ] CI is green

  • [ ] Necessary doc changes done or have another open PR

  • [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

liujinhui1994 avatar Jul 18 '22 07:07 liujinhui1994

cc @fengjian428 Can you also help to review and verify this PR?

yanghua avatar Jul 18 '22 10:07 yanghua

@codope please take it, thanks

liujinhui1994 avatar Jul 18 '22 11:07 liujinhui1994

@hudi-bot run azure

liujinhui1994 avatar Jul 20 '22 07:07 liujinhui1994

@hudi-bot run azure

liujinhui1994 avatar Jul 21 '22 01:07 liujinhui1994

@hudi-bot run azure

liujinhui1994 avatar Jul 27 '22 13:07 liujinhui1994

cc @fengjian428 Can you also help to review and verify this PR?

Will test this tomorrow or on Monday.

fengjian428 avatar Jul 30 '22 16:07 fengjian428

22/07/30 23:35:16 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: java.lang.NullPointerException
    at org.apache.hudi.client.transaction.ConcurrentOperation.getFileIdsFromRequestedReplaceMetadata(ConcurrentOperation.java:162)

This PR fixes the above NPE, but after some successful commits I got another error:

Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /xxxx/.hoodie/20220731151951392.replacecommit.requested

I will keep investigating the root cause; it should not be related to this PR.

fengjian428 avatar Jul 31 '22 15:07 fengjian428

@hudi-bot run azure

yihua avatar Sep 17 '22 07:09 yihua

CI report:

  • 639e47e513791f0fa66804b241f99077434b7a95 Azure: FAILURE

Bot commands

@hudi-bot supports the following commands:

  • @hudi-bot run azure: re-run the last Azure build

hudi-bot avatar Sep 17 '22 17:09 hudi-bot

22/07/30 23:35:16 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: java.lang.NullPointerException
    at org.apache.hudi.client.transaction.ConcurrentOperation.getFileIdsFromRequestedReplaceMetadata(ConcurrentOperation.java:162)

This PR fixes the above NPE, but after some successful commits I got another error:

Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /xxxx/.hoodie/20220731151951392.replacecommit.requested

I will keep investigating the root cause; it should not be related to this PR.

I don't think the above issue is caused by this patch, as the patch does not attempt to change the timeline in any way. At worst, it can abort a transaction, but it won't remove the requested replacecommit.

codope avatar Sep 23 '22 14:09 codope