fluent-plugin-webhdfs
Operation category READ is not supported in state standby
Hi all, please help us solve this issue. We use fluentd as our central log forwarder. We are seeing a large backlog of records in the file buffer because the webhdfs plugin is unable to flush to Hadoop. fluentd receives "Operation category READ is not supported in state standby", and the flush succeeds only after fluentd retries. There is no issue while the primary Hadoop NameNode is active. But when the active node becomes standby, the standby becomes active, and the roles later switch back, fluentd does not reset to the previously active node. Please advise on the correct settings for handling HA in the webhdfs plugin.
namenode $namenode_host
standby_namenode $standy_host
2021-09-02 13:06:57 +0000 [warn]: #11 [out_hdfs] failed to communicate hdfs cluster, path: /xxx/hadoopfs/2021/09/02/srv/20210902_1304_5cb02d28f56a255edbe3c8364e3c3c3a.bz2
2021-09-02 13:06:57 +0000 [warn]: #11 [out_hdfs] webhdfs check request failed. (namenode: $hadoohost:port, error: {"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error"}})
2021-09-02 13:06:57 +0000 [warn]: #11 [out_hdfs] failed to flush the buffer. retry_time=0 next_retry_seconds=2021-09-02 13:06:58.650648326 +0000 chunk="5cb02d28f56a255edbe3c8364e3c3c3a" error_class=WebHDFS::ServerError error="Failed to connect to host $hadoohost:port, Net::ReadTimeout with #TCPSocket:(closed)"
2021-09-02 13:06:57 +0000 [warn]: #11 suppressed same stacktrace
2021-09-02 13:06:58 +0000 [warn]: #11 [out_hdfs] retry succeeded. chunk_id="5cb02d28f56a255edbe3c8364e3c3c3a"
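
To make the behaviour I expect concrete, here is a rough Python sketch of the failover logic: detect the StandbyException in the WebHDFS error body shown in the log, then swap which NameNode is treated as active. This is only an illustration and not the plugin's actual implementation; `is_standby_error`, `NamenodePair`, and the host strings are hypothetical names I made up for the example.

```python
import json


def is_standby_error(body):
    """Return True if a WebHDFS error body reports a StandbyException."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return False  # not JSON, so not a recognisable WebHDFS error body
    if not isinstance(payload, dict):
        return False
    remote = payload.get("RemoteException")
    if not isinstance(remote, dict):
        return False
    return remote.get("exception") == "StandbyException"


class NamenodePair:
    """Hypothetical helper that tracks which of two NameNodes to talk to."""

    def __init__(self, primary, standby):
        self.active = primary
        self.other = standby

    def handle_error(self, body):
        """Swap the two nodes when the current one says it is standby."""
        if is_standby_error(body):
            self.active, self.other = self.other, self.active
            return True
        return False
```

With this logic a second StandbyException from the newly selected node swaps back to the original one, which is the "reset to previous active node" behaviour I am asking about.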