fluent-plugin-secure-forward icon indicating copy to clipboard operation
fluent-plugin-secure-forward copied to clipboard

Problem with dead connection

Open doul59 opened this issue 10 years ago • 26 comments

Hi,

I have a problem with the plugin :

Forwarder :

2015-04-07 17:37:25 +0200 [info]: starting fluentd-0.12.7
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-mixin-config-placeholders' version '0.3.0'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-mixin-plaintextformatter' version '0.2.6'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-mongo' version '0.7.8'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-parser' version '0.4.1'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-record-reformer' version '0.5.0'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.1'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.4.1'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-s3' version '0.5.7'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-scribe' version '0.10.14'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-secure-forward' version '0.2.6'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-td' version '0.10.26'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-td-monitoring' version '0.2.0'
2015-04-07 17:37:26 +0200 [info]: gem 'fluent-plugin-webhdfs' version '0.4.1'
2015-04-07 17:37:26 +0200 [info]: gem 'fluentd' version '0.12.7'
2015-04-07 17:37:26 +0200 [info]: using configuration file: <ROOT>
  <source>
    type tail
    path /var/log/syslog
    pos_file /var/log/td-agent/syslog.pos
    format none
    tag syslog
  </source>
  <match reformered.**>
    type secure_forward
    self_hostname ${hostname}
    shared_key hogeposxxx0
    keepalive 300
    buffer_type memory
    buffer_chunk_limit 256m
    buffer_queue_limit 128
    flush_interval 5s
    disable_retry_limit true
    retry_wait 30s
    <server>
      host aggregator.toto.lan
      port 282525
    </server>
  </match>
  <match **>
    type record_reformer
    hostname ${hostname}
    tag reformered.${tag}
  </match>
</ROOT>
2015-04-07 17:37:26 +0200 [info]: adding match pattern="reformered.**" type="secure_forward"
2015-04-07 17:37:11 +0200 [info]: adding match pattern="**" type="record_reformer"
2015-04-07 17:37:12 +0200 [info]: adding source type="tail"
2015-04-07 17:37:12 +0200 [info]: following tail of /var/log/syslog
2015-04-07 17:37:17 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:37:18 +0200 [warn]: temporarily failed to flush the buffer. next_retry=2015-04-07 17:37:50 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3b111da080d4"
  2015-04-07 17:37:18 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-secure-forward-0.2.6/lib/fluent/plugin/out_secure_forward.rb:216:in `write_objects'
  2015-04-07 17:37:18 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:463:in `write'
  2015-04-07 17:37:18 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/buffer.rb:325:in `write_chunk'
  2015-04-07 17:37:18 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/buffer.rb:304:in `pop'
  2015-04-07 17:37:18 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:321:in `try_flush'
  2015-04-07 17:37:18 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:140:in `run
2015-04-07 17:37:32 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:37:47 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:37:51 +0200 [warn]: temporarily failed to flush the buffer. next_retry=2015-04-07 17:38:53 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3b111da080d4"
  2015-04-07 17:37:51 +0200 [warn]: suppressed same stacktrace
2015-04-07 17:38:02 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:38:17 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:38:32 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:38:47 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:38:54 +0200 [warn]: temporarily failed to flush the buffer. next_retry=2015-04-07 17:40:40 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3b111da080d4"
  2015-04-07 17:38:54 +0200 [warn]: suppressed same stacktrace
2015-04-07 17:39:02 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:39:17 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:39:32 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:39:47 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:40:02 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:40:17 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:40:32 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting...
2015-04-07 17:40:41 +0200 [warn]: temporarily failed to flush the buffer. next_retry=2015-04-07 17:44:55 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3b111da080d4"

Aggregator :

2015-04-07 17:33:11 +0200 [info]: starting fluentd-0.12.7
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-mixin-config-placeholders' version '0.3.0'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-mixin-plaintextformatter' version '0.2.6'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-mixin-rewrite-tag-name' version '0.1.0'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-elasticsearch' version '0.7.0'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-mongo' version '0.7.8'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-mysqlslowquerylog' version '0.0.2'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-parser' version '0.4.1'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-record-reformer' version '0.5.0'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.5.1'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.4.1'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-s3' version '0.5.7'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-scribe' version '0.10.14'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-secure-forward' version '0.2.6'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-td' version '0.10.26'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-td-monitoring' version '0.2.0'
2015-04-07 17:33:11 +0200 [info]: gem 'fluent-plugin-webhdfs' version '0.4.1'
2015-04-07 17:33:11 +0200 [info]: gem 'fluentd' version '0.12.7'
2015-04-07 17:33:11 +0200 [info]: using configuration file: <ROOT>
  <source>
    type secure_forward
    self_hostname ${hostname}
    shared_key hogeposxxx0
    cert_auto_generate yes
    port 282525
  </source>
  <match fluent.**>
    type record_reformer
    hostname ${hostname}
    tag reformered.${tag}
  </match>
  <match reformered.**>
    type record_reformer
    tag ${tag_suffix[1]}
  </match>
  <match **>
    type roundrobin
    <store>
      type elasticsearch
      host x.x.x.x
      port 9200
      logstash_format true
      logstash_prefix ${tag}
      include_tag_key true
      tag_key tag
      flush_interval 1s
    </store>
    <store>
      type elasticsearch
      host x.x.x.x
      port 9200
      logstash_format true
      logstash_prefix ${tag}
      include_tag_key true
      tag_key tag
      flush_interval 1s
    </store>
  </match>
</ROOT>

Can you help me ?

doul59 avatar Apr 07 '15 15:04 doul59

@doul59 Which td-agent version are you using? Does this problem occurs always (now and in past)? Or from any point of time?

tagomoris avatar Apr 07 '15 17:04 tagomoris

From your log, connection established to xxx is missing. If plugin works correctly, above message should be logged: https://github.com/tagomoris/fluent-plugin-secure-forward/blob/e48684ac5ce801e299289b61652b39191dad8d9b/lib/fluent/plugin/output_node.rb#L204

Can your forwarder connect to aggregator? Does this happen with only secure-forward plugin?

repeatedly avatar Apr 07 '15 19:04 repeatedly

I have try 2.1.4 and upgrade this afternoon in 2.2.0 if i try without secure it's ok ...

doul59 avatar Apr 07 '15 19:04 doul59

@doul59 do you mean that everything works on td-agent 2.1.4?

tagomoris avatar Apr 07 '15 19:04 tagomoris

no i have same problem with 2.1.4

doul59 avatar Apr 07 '15 20:04 doul59

Do you configure port 282525? Under IP protocol, max number of available port is 65534 ...

tagomoris avatar Apr 07 '15 20:04 tagomoris

yes it's a mistake on my copy/paste :-1: i use 28252 but if i don't specify port in config it's same result

doul59 avatar Apr 08 '15 07:04 doul59

on aggregator : netstat -anope

tcp 517 0 ip_aggreagator:28252 ip_forwarder:58854 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggreagator:28252 ip_forwarder:58829 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggreagator:28252 ip_forwarder:58831 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggreagator:28252 ip_forwarder:58851 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggreagator:28252 ip_forwarder:58841 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggreagator:28252 ip_forwarder:58855 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggreagator:28252 ip_forwarder:58850 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggreagator:28252 ip_forwarder:58852 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggreagator:28252 ip_forwarder:58835 ESTABLISHED 0 0 - off (0.00/0/0)

doul59 avatar Apr 08 '15 07:04 doul59

no i have same problem with 2.1.4

It means 2.1.4 doesn't have this issue? Could you see the differences between 2.1.4 and 2.2.0, like network state or something?

repeatedly avatar Apr 08 '15 07:04 repeatedly

i had exactly same issue yesterday with 2.1.4, i update in 2.2.0 but nothing change ... maybe version of ruby is a problem ?

dpkg -l | grep ruby ii ruby 1:1.9.3 all Interpreter of object-oriented scripting language Ruby (default version)

2015-04-08 09:07:25 +0200 [warn]: temporarily failed to flush the buffer. next_retry=2015-04-08 09:07:52 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:35f3a4dc7ecc" 2015-04-08 09:07:25 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-secure-forward-0.2.6/lib/fluent/plugin/out_secure_forward.rb:216:in write_objects' 2015-04-08 09:07:25 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:463:inwrite' 2015-04-08 09:07:25 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/buffer.rb:325:in write_chunk' 2015-04-08 09:07:25 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/buffer.rb:304:inpop' 2015-04-08 09:07:25 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:321:in try_flush' 2015-04-08 09:07:25 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:140:inrun'

doul59 avatar Apr 08 '15 07:04 doul59

Hmm.. secure-forward works with ruby 2.1.0 or later on some environment so I hope ruby version is not related.

repeatedly avatar Apr 08 '15 09:04 repeatedly

i try with an other server // same configuration Without secure-forward : it's ok With secure-forward same issue

doul59 avatar Apr 08 '15 13:04 doul59

'Without secure-forward' means you use normal forward plugin? You seem to use Ubuntu so could you check secure-forward plugin works or not with td-agent 1?

http://docs.fluentd.org/articles/install-by-deb-v1

repeatedly avatar Apr 08 '15 13:04 repeatedly

my servers is on debian wheezy

doul59 avatar Apr 08 '15 14:04 doul59

i have done a new test : Stop td-agent on forwarder + Stop td-agent on aggregator Start agent on aggregator wait 2 min Start agent on forwarder

forwarder send to aggregator few lines of log and after X minutes forwarder is broken

2015-04-08 17:20:29 +0200 [info]: following tail of /var/log/syslog 2015-04-08 17:20:30 +0200 [info]: connection established to aggregator.toto.lan 2015-04-08 17:21:31 +0200 [warn]: disconnected from aggregator.toto.lan 2015-04-08 17:21:34 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting... 2015-04-08 17:21:39 +0200 [warn]: recovered connection to dead node: aggregator.toto.lan 2015-04-08 17:22:32 +0200 [warn]: disconnected from aggregator.toto.lan 2015-04-08 17:22:34 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting... 2015-04-08 17:22:49 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting... 2015-04-08 17:23:04 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting... 2015-04-08 17:23:19 +0200 [warn]: dead connection found: aggregator.toto.lan, reconnecting... 2015-04-08 17:23:32 +0200 [warn]: temporarily failed to flush the buffer. next_retry=2015-04-08 17:23:32 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:35357e008488" 2015-04-08 17:23:32 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-secure-forward-0.2.6/lib/fluent/plugin/out_secure_forward.rb:216:in write_objects' 2015-04-08 17:23:32 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:463:inwrite' 2015-04-08 17:23:32 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/buffer.rb:325:in write_chunk' 2015-04-08 17:23:32 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/buffer.rb:304:inpop' 2015-04-08 17:23:32 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:321:in try_flush' 2015-04-08 17:23:32 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:140:inrun' 2015-04-08 17:23:33 +0200 [warn]: temporarily failed to flush the buffer. next_retry=2015-04-08 17:23:34 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:35357e008488"

on the aggregator :

tcp 517 0 ip_aggregator:28252 ip_forwarder:60614 ESTABLISHED 0 0 - off (0.00/0/0) tcp 38 0 ip_aggregator:28252 ip_forwarder:60613 CLOSE_WAIT 110 28593 6067/ruby off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60622 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60626 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60630 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60628 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60627 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60635 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60617 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60629 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60633 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60620 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60634 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60631 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60632 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60619 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60615 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60625 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60621 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60618 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60616 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60624 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60623 ESTABLISHED 0 0 - off (0.00/0/0) tcp 517 0 ip_aggregator:28252 ip_forwarder:60636 ESTABLISHED 0 0 - off (0.00/0/0)

doul59 avatar Apr 08 '15 15:04 doul59

in debug mode :

aggregator : 2015-04-08 17:37:09 +0200 [info]: fluent/agent.rb:123:add_match: adding match pattern="fluent." type="record_reformer" 2015-04-08 17:37:09 +0200 [trace]: fluent/plugin.rb:98:register_impl: registered output plugin 'record_reformer' 2015-04-08 17:37:09 +0200 [info]: fluent/agent.rb:123:add_match: adding match pattern="reformered." type="record_reformer" 2015-04-08 17:37:09 +0200 [info]: fluent/agent.rb:123:add_match: adding match pattern="**" type="roundrobin" 2015-04-08 17:37:09 +0200 [debug]: plugin/out_roundrobin.rb:44:block in configure: adding store type="elasticsearch", weight=1 2015-04-08 17:37:09 +0200 [debug]: plugin/out_roundrobin.rb:44:block in configure: adding store type="elasticsearch", weight=1 2015-04-08 17:37:09 +0200 [info]: fluent/root_agent.rb:142:add_source: adding source type="secure_forward" 2015-04-08 17:37:09 +0200 [trace]: fluent/plugin.rb:98:register_impl: registered input plugin 'secure_forward' 2015-04-08 17:37:09 +0200 [trace]: plugin/in_secure_forward.rb:179:run: setup for ssl sessions 2015-04-08 17:37:09 +0200 [trace]: plugin/in_secure_forward.rb:185:run: start to listen bind="0.0.0.0" port=28252 2015-04-08 17:37:09 +0200 [trace]: plugin/in_secure_forward.rb:187:run: starting SSL server bind="0.0.0.0" port=28252 2015-04-08 17:37:09 +0200 [trace]: plugin/in_secure_forward.rb:191:run: accepting sessions 2015-04-08 17:38:14 +0200 [trace]: plugin/in_secure_forward.rb:194:block in run: accept tcp connection (ssl session not established yet) 2015-04-08 17:38:14 +0200 [debug]: plugin/input_session.rb:148:start: starting server 2015-04-08 17:38:14 +0200 [trace]: plugin/in_secure_forward.rb:199:block in run: session instances: all=1 closed=0 2015-04-08 17:38:14 +0200 [trace]: plugin/input_session.rb:150:start: accepting ssl session 2015-04-08 17:38:14 +0200 [debug]: plugin/input_session.rb:63:generate_helo: generating helo 2015-04-08 17:38:15 +0200 [debug]: plugin/input_session.rb:122:on_read: on_read 2015-04-08 17:38:15 +0200 [debug]: plugin/input_session.rb:69:check_ping: checking ping 2015-04-08 17:38:15 +0200 [debug]: plugin/input_session.rb:105:generate_pong: generating pong 2015-04-08 17:38:15 +0200 [debug]: plugin/input_session.rb:137:on_read: connection established 2015-04-08 17:39:15 +0200 [debug]: plugin/input_session.rb:122:on_read: on_read 2015-04-08 17:39:16 +0200 [debug]: plugin/input_session.rb:122:on_read: on_read 2015-04-08 17:39:19 +0200 [trace]: plugin/in_secure_forward.rb:194:block in run: accept tcp connection (ssl session not established yet) 2015-04-08 17:39:19 +0200 [debug]: plugin/input_session.rb:148:start: starting server 2015-04-08 17:39:19 +0200 [trace]: plugin/in_secure_forward.rb:199:block in run: session instances: all=1 closed=0 2015-04-08 17:39:19 +0200 [trace]: plugin/input_session.rb:150:start: accepting ssl session 2015-04-08 17:39:19 +0200 [debug]: plugin/input_session.rb:63:generate_helo: generating helo 2015-04-08 17:39:20 +0200 [debug]: plugin/input_session.rb:122:on_read: on_read 2015-04-08 17:39:20 +0200 [debug]: plugin/input_session.rb:69:check_ping: checking ping 2015-04-08 17:39:20 +0200 [debug]: plugin/input_session.rb:105:generate_pong: generating pong 2015-04-08 17:39:20 +0200 [debug]: plugin/input_session.rb:137:on_read: connection established 2015-04-08 17:39:21 +0200 [info]: plugin/out_elasticsearch.rb:67:client: Connection opened to Elasticsearch cluster => {:host=>"ip-es1", :port=>9200, :scheme=>"http"} 2015-04-08 17:39:21 +0200 [info]: plugin/out_elasticsearch.rb:67:client: Connection opened to Elasticsearch cluster => {:host=>"ip-es2", :port=>9200, :scheme=>"http"}

forwarder : 2015-04-08 17:38:14 +0200 [info]: fluent/agent.rb:123:add_match: adding match pattern="reformered." type="secure_forward" 2015-04-08 17:38:14 +0200 [trace]: fluent/plugin.rb:98:register_impl: registered output plugin 'secure_forward' 2015-04-08 17:38:14 +0200 [info]: fluent/agent.rb:123:add_match: adding match pattern="" type="record_reformer" 2015-04-08 17:38:14 +0200 [trace]: fluent/plugin.rb:98:register_impl: registered output plugin 'record_reformer' 2015-04-08 17:38:14 +0200 [info]: fluent/root_agent.rb:142:add_source: adding source type="tail" 2015-04-08 17:38:14 +0200 [debug]: plugin/out_secure_forward.rb:121:start: starting secure-forward 2015-04-08 17:38:14 +0200 [debug]: plugin/out_secure_forward.rb:123:start: start to connect target nodes 2015-04-08 17:38:14 +0200 [debug]: plugin/out_secure_forward.rb:125:block in start: connecting node host="aggregator.toto.lan" port=28252 2015-04-08 17:38:14 +0200 [info]: plugin/in_tail.rb:477:initialize: following tail of /var/log/syslog 2015-04-08 17:38:14 +0200 [debug]: plugin/output_node.rb:212:connect: starting client 2015-04-08 17:38:14 +0200 [debug]: plugin/output_node.rb:215:connect: create tcp socket to node host="aggregator.toto.lan" address="192.168.18.81" port=28252 2015-04-08 17:38:14 +0200 [trace]: plugin/output_node.rb:224:connect: changing socket options 2015-04-08 17:38:14 +0200 [trace]: plugin/output_node.rb:232:connect: initializing SSL contexts 2015-04-08 17:38:14 +0200 [debug]: plugin/output_node.rb:237:connect: trying to connect ssl session host="aggregator.toto.lan" address="192.168.18.81" port=28252 2015-04-08 17:38:14 +0200 [debug]: plugin/output_node.rb:249:connect: ssl session connected host="aggregator.toto.lan" port=28252 2015-04-08 17:38:14 +0200 [debug]: plugin/output_node.rb:269:connect: ssl sessison connected host="aggregator.toto.lan" port=28252 2015-04-08 17:38:14 +0200 [debug]: plugin/output_node.rb:181:on_read: on_read 2015-04-08 17:38:14 +0200 [debug]: plugin/output_node.rb:129:check_helo: checking helo 2015-04-08 17:38:14 +0200 [debug]: plugin/output_node.rb:141:generate_ping: generating ping 2015-04-08 17:38:15 +0200 [debug]: plugin/output_node.rb:181:on_read: on_read 2015-04-08 17:38:15 +0200 [debug]: plugin/output_node.rb:156:check_pong: checking pong 2015-04-08 17:38:15 +0200 [info]: plugin/output_node.rb:204:on_read: connection established to aggregator.toto.lan 2015-04-08 17:38:15 +0200 [debug]: plugin/output_node.rb:207:on_read: connection established host="aggregator.toto.lan" port=28252 expire=nil 2015-04-08 17:38:19 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:38:19 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:38:24 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:38:24 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:38:29 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:38:29 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:38:34 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:38:34 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:38:39 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:38:39 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:38:44 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:38:44 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:38:49 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:38:49 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:38:54 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:38:54 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:38:59 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:38:59 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:39:04 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:39:04 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:39:09 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:39:09 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:39:14 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:39:14 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:39:15 +0200 [trace]: plugin/out_secure_forward.rb:218:write_objects: selected node host="aggregator.toto.lan" port=28252 standby=false 2015-04-08 17:39:16 +0200 [trace]: plugin/out_secure_forward.rb:218:write_objects: selected node host="aggregator.toto.lan" port=28252 standby=false 2015-04-08 17:39:16 +0200 [warn]: plugin/output_node.rb:294:rescue in block in connect: disconnected from aggregator.toto.lan 2015-04-08 17:39:16 +0200 [debug]: plugin/output_node.rb:89:shutdown: shutting down node aggregator.toto.lan 2015-04-08 17:39:17 +0200 [warn]: fluent/output.rb:354:rescue in try_flush: temporarily failed to flush the buffer. next_retry=2015-04-08 17:39:18 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:335bbe846424" 2015-04-08 17:39:17 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-secure-forward-0.2.6/lib/fluent/plugin/out_secure_forward.rb:216:in write_objects' 2015-04-08 17:39:17 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:463:inwrite' 2015-04-08 17:39:17 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/buffer.rb:325:in write_chunk' 2015-04-08 17:39:17 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/buffer.rb:304:inpop' 2015-04-08 17:39:17 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:321:in try_flush' 2015-04-08 17:39:17 +0200 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.7/lib/fluent/output.rb:140:inrun' 2015-04-08 17:39:19 +0200 [warn]: fluent/output.rb:354:rescue in try_flush: temporarily failed to flush the buffer. next_retry=2015-04-08 17:39:19 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:335bbe846424" 2015-04-08 17:39:19 +0200 [warn]: suppressed same stacktrace 2015-04-08 17:39:19 +0200 [trace]: plugin/out_secure_forward.rb:139:block in node_watcher: in node health watcher 2015-04-08 17:39:19 +0200 [trace]: plugin/out_secure_forward.rb:142:block (2 levels) in node_watcher: node health watcher for aggregator.toto.lan 2015-04-08 17:39:19 +0200 [warn]: plugin/out_secure_forward.rb:151:block (2 levels) in node_watcher: dead connection found: aggregator.toto.lan, reconnecting... 2015-04-08 17:39:19 +0200 [debug]: plugin/out_secure_forward.rb:156:block (2 levels) in node_watcher: reconnecting to node host="aggregator.toto.lan" port=28252 expire=nil expired=false detached=false 2015-04-08 17:39:19 +0200 [debug]: plugin/output_node.rb:212:connect: starting client 2015-04-08 17:39:19 +0200 [trace]: plugin/out_secure_forward.rb:168:block (2 levels) in node_watcher: checking reconnecting node aggregator.toto.lan 2015-04-08 17:39:19 +0200 [debug]: plugin/output_node.rb:215:connect: create tcp socket to node host="aggregator.toto.lan" address="192.168.18.81" port=28252 2015-04-08 17:39:19 +0200 [trace]: plugin/output_node.rb:224:connect: changing socket options 2015-04-08 17:39:19 +0200 [trace]: plugin/output_node.rb:232:connect: initializing SSL contexts 2015-04-08 17:39:19 +0200 [debug]: plugin/output_node.rb:237:connect: trying to connect ssl session host="aggregator.toto.lan" address="192.168.18.81" port=28252 2015-04-08 17:39:19 +0200 [debug]: plugin/output_node.rb:249:connect: ssl session connected host="aggregator.toto.lan" port=28252 2015-04-08 17:39:19 +0200 [debug]: plugin/output_node.rb:269:connect: ssl sessison connected host="aggregator.toto.lan" port=28252 2015-04-08 17:39:19 +0200 [debug]: plugin/output_node.rb:181:on_read: on_read 2015-04-08 17:39:19 +0200 [debug]: plugin/output_node.rb:129:check_helo: checking helo 2015-04-08 17:39:19 +0200 [debug]: plugin/output_node.rb:141:generate_ping: generating ping 2015-04-08 17:39:20 +0200 [debug]: plugin/output_node.rb:181:on_read: on_read 2015-04-08 17:39:20 +0200 [debug]: plugin/output_node.rb:156:check_pong: checking pong 2015-04-08 17:39:20 +0200 [debug]: plugin/output_node.rb:207:on_read: connection established host="aggregator.toto.lan" port=28252 expire=nil 2015-04-08 17:39:20 +0200 [warn]: fluent/output.rb:354:rescue in try_flush: temporarily failed to flush the buffer. next_retry=2015-04-08 17:39:24 +0200 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:335bbe846424"

doul59 avatar Apr 08 '15 15:04 doul59

How about to configure forwarder without keepalive 300? (it means keep connection alive forever.)

tagomoris avatar Apr 08 '15 17:04 tagomoris

Hmm... I tried Debian 7 with td-agent 2.2.0 but there is no problem in localhost test.

tcp        0      0 127.0.0.1:28252         127.0.0.1:50370         ESTABLISHED 0          27790       -                off (0.00/0/0)
tcp        0      0 127.0.0.1:50370         127.0.0.1:28252         ESTABLISHED 0          27789       -                off (0.00/0/0)

From your netstat, there are lots of 517 Recv-Q and 0 inode connections. This is weird...

tcp 38 0 ip_aggregator:28252 ip_forwarder:60613 CLOSE_WAIT 110 28593 6067/ruby off (0.00/0/0)

What is this program? 6067/ruby is your test program?

repeatedly avatar Apr 08 '15 18:04 repeatedly

without keepalive : same issue

in this exemple : 6067/ruby is td-agent process

doul59 avatar Apr 09 '15 07:04 doul59

i had reinstall all plateform // i have same issue without secure-forward :/

doul59 avatar Apr 09 '15 16:04 doul59

Same issue without secure-forward? It's weird, and it seems not about secure-forward, but about network quality of your system or so...

tagomoris avatar Apr 09 '15 17:04 tagomoris

(i used fluent 2.1.4 since 2 months without issue) i had upgrade in 2.2.0 and some issue :/

i try reinstall in 2.1.4 ;)

doul59 avatar Apr 10 '15 09:04 doul59

same result with 2.1.4 i don't understand ... i have try with new VM and reinstall OS + fluentd and i have always this issue forwarder can't stay connected with aggregator (not a network problem ... same network tcpdump seems ok)

doul59 avatar Apr 10 '15 14:04 doul59

I have the same issue. How do we fix this or is there an work around?

jo-bi avatar Oct 14 '15 13:10 jo-bi

I released experimental version of this plugin, v0.3.5pre1. It may improve your problems if you're specifying keepalive timeout option. With this version, secure-forward output plugin will disconnect stuck connections forcedly after keepalive timeout (in default, connection_hard_timeout is 1.2x of keepalive).

If you have troubles on stuck connections with keepalive, please try it, with just few nodes at first :P

tagomoris avatar Mar 08 '16 19:03 tagomoris

Hi

I get a similar issue to this. A connection fails to be established even though the logs say the retry succeeded nothing makes it to the aggregator. Any insight as to what could be causing this issue?

2017-07-19 11:10:45 +0000 [info]: connection established to logger.contacta.io 2017-07-19 11:11:45 +0000 [warn]: disconnected host="logger.contacta.io" port=24284 2017-07-19 11:11:45 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-07-19 11:11:46 +0000 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3faae7565348" 2017-07-19 11:11:45 +0000 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-secure-forward-0.4.5/lib/fluent/plugin/out_secure_forward.rb:259:in write_objects' 2017-07-19 11:11:45 +0000 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:490:in write' 2017-07-19 11:11:45 +0000 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/buffer.rb:354:in write_chunk' 2017-07-19 11:11:45 +0000 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/buffer.rb:333:in pop' 2017-07-19 11:11:45 +0000 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:342:in try_flush' 2017-07-19 11:11:45 +0000 [warn]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.35/lib/fluent/output.rb:149:in run' 2017-07-19 11:11:46 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-07-19 11:11:49 +0000 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3faae7565348" 2017-07-19 11:11:46 +0000 [warn]: suppressed same stacktrace 2017-07-19 11:11:49 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-07-19 11:11:52 +0000 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3faae7565348" 2017-07-19 11:11:49 +0000 [warn]: suppressed same stacktrace 2017-07-19 11:11:49 +0000 [warn]: dead connection found: logger.contacta.io, reconnecting... 2017-07-19 11:11:52 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-07-19 11:12:01 +0000 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3faae7565348" 2017-07-19 11:11:52 +0000 [warn]: suppressed same stacktrace 2017-07-19 11:11:54 +0000 [warn]: recovered connection to dead node: logger.contacta.io 2017-07-19 11:12:01 +0000 [warn]: retry succeeded. plugin_id="object:3faae7565348" 2017-07-19 11:12:02 +0000 [warn]: disconnected host="logger.contacta.io" port=24284 2017-07-19 11:12:02 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-07-19 11:12:03 +0000 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3faae7565348" 2017-07-19 11:12:02 +0000 [warn]: suppressed same stacktrace 2017-07-19 11:12:03 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-07-19 11:12:05 +0000 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3faae7565348" 2017-07-19 11:12:03 +0000 [warn]: suppressed same stacktrace 2017-07-19 11:12:04 +0000 [warn]: dead connection found: logger.contacta.io, reconnecting... 2017-07-19 11:12:05 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-07-19 11:12:09 +0000 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3faae7565348" 2017-07-19 11:12:05 +0000 [warn]: suppressed same stacktrace 2017-07-19 11:12:09 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-07-19 11:12:17 +0000 error_class="RuntimeError" error="no one nodes with valid ssl session" plugin_id="object:3faae7565348" 2017-07-19 11:12:09 +0000 [warn]: suppressed same stacktrace

  • Alex

alexilievski avatar Jul 19 '17 11:07 alexilievski