fluent-plugin-webhdfs

When trying to append to a file more often than the default timekey, an exception happens

bobykus31 opened this issue on Nov 19, 2018 · 0 comments

Here is my config

<match iotop.0>
    @type copy
    deep_copy true

    <store>
        @type webhdfs
        namenode          master1.hdoo.com:50070 
        standby_namenode  master2.hdoo.com:50070
        path  "/raw/iostats/%Y%m%d/%H/#{Socket.gethostname}.log"
        compress gzip  # or 'bzip2', 'snappy', 'lzo_command'
        retry_known_errors yes
        retry_interval 65
        open_timeout 180
        read_timeout 180
        append yes
        username fluent
        ignore_start_check_error true
        <buffer>
           @type   file
           flush_interval 1m
           timekey  10m
           path  /var/spool/td-agent/fluentd_buffer_fwebhdfs/
        </buffer>
        <format>
           @type csv
           fields UNIXTIME,USER,DISKREAD,DISKWRITE
           force_quotes false 
        </format>
        <secondary>
          @type secondary_file
          directory /var/spool/td-agent/webhdfs_failed/
        </secondary>
    </store>
</match>
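
For context, here is a minimal Ruby sketch (an illustration only, assuming the path template follows plain strftime semantics; the variable names are hypothetical) of why every 10-minute chunk in a given hour resolves to the same file:

require 'socket'

# Illustrative only, not plugin code: with timekey 10m, fluentd cuts a chunk
# every 10 minutes, but the %H-based path template renders the same file for
# all six chunks of an hour, so their flushes all append to one path.
path_format = "/raw/iostats/%Y%m%d/%H/#{Socket.gethostname}.log"

chunk_a = Time.utc(2018, 11, 19, 13, 30) # chunk for timekey 13:30
chunk_b = Time.utc(2018, 11, 19, 13, 40) # chunk for timekey 13:40

puts chunk_a.strftime(path_format)
puts chunk_b.strftime(path_format) # => same /raw/iostats/20181119/13/... file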

The goal is to sync data down to Hadoop every 10 minutes, even though the file path is %H based. Every time the flush process starts, the following issue appears:

2018-11-19 13:40:31 +0000 [warn]: #19 failed to flush the buffer. retry_time=0 next_retry_seconds=2018-11-19 13:40:32 +0000 chunk="57b048883d347b0ca22650dac40e1961" error_class=WebHDFS::IOError error="{\"RemoteException\":{\"exception\":\"AlreadyBeingCreatedException\",\"javaClassName\":\"org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException\",\"message\":\"Failed to create file [/raw/iostats/20181119/13/fluentd1.log.gz] for [DFSClient_NONMAPREDUCE_1591351286_116] for client [10.64.11.19], because this file is already being created by [DFSClient_NONMAPREDUCE_1898391037_115] on [10.64.11.19]\\n\\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3276)\\n\\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:3056)\\n\\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3341)\\n\\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3305)\\n\\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:621)\\n\\tat org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:129)\\n\\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:433)\\n\\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\\n\\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)\\n\\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)\\n\\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)\\n\\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)\\n\\tat java.security.AccessController.doPrivileged(Native Method)\\n\\tat javax.security.auth.Subject.doAs(Subject.java:415)\\n\\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)\\n\\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)\\n\"}}"
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:401:in `request'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:270:in `operate_requests'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/webhdfs-0.8.0/lib/webhdfs/client_v1.rb:84:in `append'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.3/lib/fluent/plugin/out_webhdfs.rb:272:in `send_data'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.3/lib/fluent/plugin/out_webhdfs.rb:389:in `block in write'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.3/lib/fluent/plugin/out_webhdfs.rb:335:in `compress_context'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-webhdfs-1.2.3/lib/fluent/plugin/out_webhdfs.rb:388:in `write'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/plugin/output.rb:1111:in `try_flush'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/plugin/output.rb:1394:in `flush_thread_run'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/plugin/output.rb:445:in `block (2 levels) in start'
  2018-11-19 13:40:31 +0000 [warn]: #19 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'

Then, the next second, the retry succeeds:

2018-11-19 13:40:32 +0000 [warn]: #19 retry succeeded. chunk_id="57b048883d347b0ca22650dac40e1961"
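
If I understand it right, HDFS grants a single-writer lease per file, so two flush threads or workers appending to the same %H-based path at the same moment will collide, and the loser gets AlreadyBeingCreatedException until the lease is released. A hedged sketch of that race with the webhdfs gem (not the plugin's actual code; the host, port, user, and path are copied from the config and log above):

require 'webhdfs'

# Hedged sketch of the suspected race, not out_webhdfs internals.
client = WebHDFS::Client.new('master1.hdoo.com', 50070, 'fluent')

threads = 2.times.map do |i|
  Thread.new do
    # Two overlapping appends to one file: HDFS gives the lease to the first
    # DFSClient; the second raises WebHDFS::IOError wrapping
    # AlreadyBeingCreatedException, as in the log above.
    client.append('/raw/iostats/20181119/13/fluentd1.log.gz', "row #{i}\n")
  end
end

threads.each do |t|
  begin
    t.join
  rescue WebHDFS::IOError => e
    puts e.message # typically one of the two appends fails this way
  end
end

Once the first append completes and the lease is freed, a retried append goes through, which would match the one-second retry success above.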

Is this expected behavior? We use Cloudera Manager 5.13.0-1.cm5130.p0.55~wheezy-cm5 as our Hadoop distribution.
