fluent-plugin-webhdfs

Operation category READ is not supported in state standby

Open lakhsaaron opened this issue 3 years ago • 0 comments

Hi all, please help us solve this issue. We are using Fluentd as our central log forwarder. We are seeing a large backlog of records in the file buffer because the webhdfs plugin is unable to flush to Hadoop. Fluentd receives "Operation category READ is not supported in state standby", and the flush succeeds only after Fluentd retries. There is no issue while the primary Hadoop node is active. But when the active node becomes standby, the standby becomes active, and the roles then switch back, Fluentd does not return to the previously active node. Please advise on the correct settings for handling HA in the webhdfs plugin.

namenode $namenode_host
standby_namenode $standy_host

2021-09-02 13:06:57 +0000 [warn]: #11 [out_hdfs] failed to communicate hdfs cluster, path: /xxx/hadoopfs/2021/09/02/srv/20210902_1304_5cb02d28f56a255edbe3c8364e3c3c3a.bz2
2021-09-02 13:06:57 +0000 [warn]: #11 [out_hdfs] webhdfs check request failed. (namenode: $hadoohost:port, error: {"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error"}})
2021-09-02 13:06:57 +0000 [warn]: #11 [out_hdfs] failed to flush the buffer. retry_time=0 next_retry_seconds=2021-09-02 13:06:58.650648326 +0000 chunk="5cb02d28f56a255edbe3c8364e3c3c3a" error_class=WebHDFS::ServerError error="Failed to connect to host $hadoohost:port, Net::ReadTimeout with #TCPSocket:(closed)"
2021-09-02 13:06:57 +0000 [warn]: #11 suppressed same stacktrace
2021-09-02 13:06:58 +0000 [warn]: #11 [out_hdfs] retry succeeded. chunk_id="5cb02d28f56a255edbe3c8364e3c3c3a"
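To narrow down which NameNode is answering as standby at a given moment, the error body from the log above can be classified with a small script. This is only a rough sketch for diagnosis, not how fluent-plugin-webhdfs or the webhdfs gem handles failover internally. `standby_response?` and `pick_active` are hypothetical helper names, and the caller is assumed to supply the response body for each NameNode (for example from a WebHDFS `GETFILESTATUS` request).

```ruby
require "json"

# Returns true when a WebHDFS response body is a RemoteException
# whose exception name is StandbyException (as in the log above).
def standby_response?(body)
  parsed = JSON.parse(body)
  return false unless parsed.is_a?(Hash)

  remote = parsed["RemoteException"]
  remote.is_a?(Hash) && remote["exception"] == "StandbyException"
rescue JSON::ParserError, TypeError
  # Non-JSON or nil bodies are not treated as a standby answer.
  false
end

# Hypothetical helper: returns the first namenode whose probe body
# (provided by the given block) is not a StandbyException, or nil.
def pick_active(namenodes)
  namenodes.find { |nn| !standby_response?(yield(nn)) }
end

# Error body copied from the log above.
body = '{"RemoteException":{"exception":"StandbyException",' \
       '"javaClassName":"org.apache.hadoop.ipc.StandbyException",' \
       '"message":"Operation category READ is not supported in state standby. ' \
       'Visit https://s.apache.org/sbnn-error"}}'
puts standby_response?(body)
```

The block passed to `pick_active` could, for instance, fetch `http://<host>:<port>/webhdfs/v1/?op=GETFILESTATUS` with `Net::HTTP` and return the response body. That probe is left to the caller so the classification logic can be run without a cluster.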

lakhsaaron, Sep 07 '21 11:09