Buffer file format

Open waqarsky opened this issue 2 years ago • 1 comments

Describe the bug

I know Fluentd buffers have their own binary encoding, but strings that I expect to be identical throughout the buffer file are inconsistent.

To Reproduce

cat --show-nonprinting buffer.b5ee47fc4ebdc5a82c9c3788851a80181.log

Expected behavior

sky.top_tenant should be consistent in the whole file. The strings below should all be the same but they contain different non-printable characters.

The following are three separate excerpts from the cat output for the buffer file:

sky.top_tenantM-$waveM
skyM-^AM-*top_tenantM-(identityM
sky.top_tenantM-$waveM

The input being passed in is: sky.top_tenant

Within the same buffer file, sky.top_tenant is rendered differently even though the input is the same.
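
One possible reading of the excerpts above, offered as an assumption rather than a confirmed diagnosis: Fluentd buffer chunks are MessagePack-encoded, and in `cat --show-nonprinting` output an `M-` prefix marks a byte with the high bit set. Under that reading, `M-$` is byte 0xA4 (a MessagePack fixstr header of length 4, followed by `wave`), `M-^A` is 0x81 (a fixmap with one entry), and `M-*` / `M-(` are fixstr headers of length 10 and 8. That would mean the first and third records carry a flat key named `sky.top_tenant`, while the second carries a nested `sky` map containing `top_tenant`. The Python sketch below reverses the notation and labels those markers. The function names `decode_cat_v` and `annotate` are hypothetical, the trailing truncated `M` from each excerpt is dropped, and only fixstr and fixmap headers are recognized, so this is not a full MessagePack parser.

```python
def decode_cat_v(text: str) -> bytes:
    """Reverse `cat --show-nonprinting` notation back into raw bytes.

    The notation is ambiguous for input that literally contains "M-" or "^",
    so this is a best-effort sketch, not an exact inverse.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        high = 0
        # "M-" means the following byte had its high bit (0x80) set.
        if text.startswith("M-", i) and i + 2 < len(text):
            high = 0x80
            i += 2
        # "^X" is a control character: flip bit 0x40 ("^A" -> 0x01, "^?" -> 0x7F).
        if text[i] == "^" and i + 1 < len(text):
            out.append(high | (ord(text[i + 1]) ^ 0x40))
            i += 2
        else:
            out.append(high | ord(text[i]))
            i += 1
    return bytes(out)


def annotate(data: bytes) -> list[str]:
    """Label MessagePack fixstr (0xA0-0xBF) and fixmap (0x80-0x8F) headers."""
    notes = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0xA0 <= b <= 0xBF:
            # fixstr: the low 5 bits hold the string length.
            n = b & 0x1F
            payload = data[i + 1:i + 1 + n].decode("utf-8", "replace")
            notes.append(f"fixstr[{n}] {payload!r}")
            i += 1 + n
        elif 0x80 <= b <= 0x8F:
            # fixmap: the low 4 bits hold the number of key/value pairs.
            notes.append(f"fixmap[{b & 0x0F}]")
            i += 1
        else:
            # Anything else: group the run of plain ASCII bytes (at least one byte).
            j = i
            while j < len(data) and data[j] < 0x80:
                j += 1
            j = max(j, i + 1)
            notes.append(f"bytes {data[i:j]!r}")
            i = j
    return notes


if __name__ == "__main__":
    # Excerpts from this issue, without the trailing truncated "M".
    for sample in ("sky.top_tenantM-$wave", "skyM-^AM-*top_tenantM-(identity"):
        print(sample, "->", annotate(decode_cat_v(sample)))
```

If this interpretation holds, the difference is in the shape of the incoming records (flat key versus nested map) rather than in how the buffer writes the same string; checking the upstream events for both shapes would confirm or rule this out.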

Your Environment

- Fluentd version: 1.15.2
- TD Agent version: N/A
- Operating system: Ubuntu
- Kernel version: 5.15.0-1022-aws

Your Configuration

<match {logdocument.tenant_in_mono.fluentd,tenant_in_mono.filebeat,tenant_in_mono.functionbeat,logdocument.tenant_in_mono.fluentd_ssl}>
  @type copy
  @log_level info
  <store>
    @type elasticsearch
    reconnect_on_error true
    reload_on_failure true
    reload_connections false
    max_retry_putting_template 1
    request_timeout 60s
    fail_on_putting_template_retry_exceed false
    slow_flush_log_threshold 100.0
    @id        out_es_logs-tenant_in_mono
    @log_level info
    log_es_400_reason true

    id_key      _hash
    remove_keys _hash

    hosts {redacted}
    user "{redacted}"
    password "{redacted}"
    ca_file "/etc/fluentd/aaa.crt"
    ssl_version TLSv1_2
    ssl_verify false

    index_name               logs-${sky.top_tenant}-fluentd
    time_key                 time
    include_timestamp        true
    include_tag_key          true
    flatten_hashes           false
    flatten_hashes_separator _

    # Rollover index config
    rollover_index     true
    application_name   default
    index_date_pattern "now/d"
    deflector_alias    logs-${sky.top_tenant}-fluentd

    # Index template
    template_name      logs-${sky.top_tenant}-fluentd
    template_file      /etc/fluentd/logs-template.json
    customize_template {"<<TAG>>":"${sky.top_tenant}"}
    template_overwrite true
    <buffer tag,sky.top_tenant>
      retry_wait 20s
      retry_exponential_backoff_base 2
      retry_type exponential_backoff
      retry_max_interval 300s
      disable_chunk_backup true
      @type file
      path /fluentd/es-out-logs-tenant_in_mono

      flush_thread_count 8
      flush_interval     5s
      flush_at_shutdown  true
      overflow_action block
      chunk_limit_size 16M
      # total_limit_size is set to 70% of the data disk so that a single output can't use more than this
      total_limit_size   137G
      retry_forever      false
    </buffer>
  </store>
</match>

Your Error Log

There are errors in the log, but I am unsure whether they are related:

2022-11-25 11:21:15 +0000 [warn]: #6 [out_es_logs-tenant_in_mono] failed to flush the buffer. retry_times=4 next_retry_time=2022-11-25 11:26:33 +0000 chunk="5ee25623683db952a383785930d688c3" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"{redacted}\", :port=>9200, :scheme=>\"https\", :user=>\"{redacted}\", :password=>\"obfuscated\"}): EOFError (EOFError)"

Additional context

No response

waqarsky avatar Nov 25 '22 11:11 waqarsky

@fujimotos Is there any more info on this?

waqarsky avatar Dec 07 '22 11:12 waqarsky