Caused by: org.apache.http.NoHttpResponseException: xxxxxx:34812 failed to respond [SUPPORT]

Open Aload opened this issue 3 years ago • 2 comments

Describe the problem you faced

After the program has been running for a while, the exception shown in the stacktrace below occurs repeatedly.

Environment Description

  • Hudi version: 0.12.0

  • Spark version: 3.2.1

  • Hive version: 2.3.7

  • Hadoop version: 3.0.0

  • Storage (HDFS/S3/GCS..): HDFS

  • Running on Docker? (yes/no): no

Stacktrace

org.apache.hudi.exception.HoodieRemoteException: 10.0.20.51:34812 failed to respond
	at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.getPendingCompactionOperations(RemoteHoodieTableFileSystemView.java:438) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.execute(PriorityBasedFileSystemView.java:68) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.getPendingCompactionOperations(PriorityBasedFileSystemView.java:224) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.table.action.compact.ScheduleCompactionActionExecutor.scheduleCompaction(ScheduleCompactionActionExecutor.java:117) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.table.action.compact.ScheduleCompactionActionExecutor.execute(ScheduleCompactionActionExecutor.java:93) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.table.HoodieFlinkMergeOnReadTable.scheduleCompaction(HoodieFlinkMergeOnReadTable.java:109) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.client.BaseHoodieWriteClient.scheduleTableServiceInternal(BaseHoodieWriteClient.java:1353) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.client.BaseHoodieWriteClient.scheduleTableService(BaseHoodieWriteClient.java:1330) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.client.BaseHoodieWriteClient.scheduleCompactionAtInstant(BaseHoodieWriteClient.java:1009) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.client.BaseHoodieWriteClient.scheduleCompaction(BaseHoodieWriteClient.java:1000) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.util.CompactionUtil.scheduleCompaction(CompactionUtil.java:65) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$notifyCheckpointComplete$2(StreamWriteOperatorCoordinator.java:246) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:130) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: org.apache.http.NoHttpResponseException: 10.0.20.51:34812 failed to respond
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55) ~[anso-process-0.0.1.jar:?]
	at org.apache.http.client.fluent.Request.execute(Request.java:151) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView$HttpRequestCheckedFunction.get(RemoteHoodieTableFileSystemView.java:526) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.executeRequest(RemoteHoodieTableFileSystemView.java:184) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	at org.apache.hudi.common.table.view.RemoteHoodieTableFileSystemView.getPendingCompactionOperations(RemoteHoodieTableFileSystemView.java:434) ~[hudi-flink1.14-bundle-0.12.0.jar:0.12.0]
	... 15 more
2022-09-07 06:54:16,271 WARN  org.apache.hudi.common.table.view.PriorityBasedFileSystemView [] - Routing request to secondary file-system view.

Aload avatar Sep 07 '22 02:09 Aload

I have encountered this problem as well. This PR may solve it: https://github.com/apache/hudi/pull/6393

LinMingQiang avatar Sep 09 '22 02:09 LinMingQiang

@Aload can you verify whether the patch is included in your version of Hudi, and whether you are still seeing the problem?

To help diagnose this, we need more information to reproduce it, such as your configs and a code snippet.

xushiyan avatar Sep 15 '22 01:09 xushiyan

Yes, version 0.12.0.

Aload avatar Oct 09 '22 02:10 Aload

@danny0405 @yuzhaojing: have you seen this exception before? Any pointers?

nsivabalan avatar Oct 22 '22 23:10 nsivabalan

Yes, I had the same problem when I was on 0.11.

Aload avatar Oct 28 '22 10:10 Aload

I upgraded to 0.12.1 yesterday and the same problem still occurs frequently.

Aload avatar Nov 08 '22 01:11 Aload

Try the option hoodie.filesystem.view.remote.retry.enable, which was introduced in 0.12.1; it is expected to solve the problem. Feel free to reopen the issue if the problem persists.

danny0405 avatar Nov 08 '22 02:11 danny0405
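
For anyone landing here later, a minimal sketch of how the suggested option could be added to a job's table options. It assumes the job passes Hudi settings through Flink table options (the stacktrace shows the hudi-flink1.14 bundle on a merge-on-read table). Only the option key comes from this thread; the class name, helper methods, and table path are hypothetical, and the option requires Hudi 0.12.1 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hudi: it only assembles option strings.
public class RemoteViewRetryOptions {
    // Option key quoted in the thread; introduced in Hudi 0.12.1.
    public static final String RETRY_ENABLE = "hoodie.filesystem.view.remote.retry.enable";

    /** Returns a copy of the given options with remote file-system-view retries enabled. */
    public static Map<String, String> withRemoteViewRetry(Map<String, String> base) {
        Map<String, String> options = new LinkedHashMap<>(base);
        options.put(RETRY_ENABLE, "true");
        return options;
    }

    /** Renders the options as the body of a Flink SQL WITH (...) clause. */
    public static String toWithClause(Map<String, String> options) {
        return options.entrySet().stream()
                .map(e -> "  '" + e.getKey() + "' = '" + e.getValue() + "'")
                .collect(Collectors.joining(",\n"));
    }

    public static void main(String[] args) {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("connector", "hudi");
        base.put("path", "hdfs:///tmp/hudi/example_table"); // assumed path, for illustration only
        base.put("table.type", "MERGE_ON_READ");

        // Print the WITH clause body that would go into the CREATE TABLE statement.
        System.out.println("WITH (\n" + toWithClause(withRemoteViewRetry(base)) + "\n)");
    }
}
```

The same key/value pair can be supplied through whatever mechanism the job already uses for Hudi write configs. Whether it fully removes the exception depends on why the remote timeline server stops responding, so it is worth checking the logs again after enabling it.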