rtb icon indicating copy to clipboard operation
rtb copied to clipboard

Connextions drops due to error

Open amitchauhan1502 opened this issue 5 years ago • 8 comments

Hi I am doing eJabberd load testing using rtb . In almost all test runs connections drop due to following error message : [debug] FORMAT ERROR: "~s terminated: ~s; reconnecting..." [[{...},35,"****"],<<"Connection failed: connection timed out">>]

"Rate of connections drop"/"new connection establishment" depends upon frequency of occurrence of the above error. When error frequency is high, new connections establishment rate is low and vice versa

I am using following configuration parameters scenario: xmpp interval: 100 capacity: 20000 starttls: false proxy65_interval: false debug: true roster: false csi: false mam: false reconnect_interval: 5

Please suggest how to eliminate/minimize the above error.

amitchauhan1502 avatar Jun 02 '20 07:06 amitchauhan1502

Raise interval:, it's useless to set it at 100ms if your server hardware/network can't take it.

What's the server load when this error starts to happen? CPU, I/O, RAM

licaon-kter avatar Jun 02 '20 08:06 licaon-kter

Thanks for your suggestion. Both CPU and memory utilization was under 15% when the error starts to appear. However I haven't captured screenshot. Also I will test with raised interval: values for furthers tests and report if the error message appears again.

amitchauhan1502 avatar Jun 02 '20 08:06 amitchauhan1502

image

Attached is the system resource utilization while load test in progress for 20k users. Both CPU and memory utilization within 6-7% an still I am getting frequent error message:

2020-06-03 07:43:16.154 [debug] <0.1809.0>@mod_xmpp:handle_stream_end:400 FORMAT ERROR: "~s terminated: ~s; reconnecting..." [[{,,,},35,"**"],<<"Connection failed: timeout">>]

My rtb yml file config is :- scenario: xmpp interval: 500 capacity: 20000

jid: user7_%@IP password: pass7_%

servers:

  • tcp://IP:5222

stats_dir: /root/rtb/stats www_dir: /root/rtb/www www_port: 80 gnuplot: /usr/bin/gnuplot

muc_rooms:

  • test_nic[1..2000]@conference.IP

starttls: false proxy65_interval: false debug: true roster: false csi: false mam: false reconnect_interval: 5

amitchauhan1502 avatar Jun 03 '20 08:06 amitchauhan1502

Below is the report after about 30 mins of execution. Connections drop due to the frequent timeout error message shared in previous comment

image

amitchauhan1502 avatar Jun 03 '20 08:06 amitchauhan1502

ejabberd.log says what? system logs say what?

licaon-kter avatar Jun 03 '20 08:06 licaon-kter

Hi I couldn't find any problem with ejabberd.log or system logs. Only major problem is repetitive error message with rtb logs which results in increased connection drop rate:

[debug] FORMAT ERROR: "~s terminated: ~s; reconnecting..." [[{...},35,"****"],<<"Connection failed: timeout">>]

Also sometimes error frequency is high and sometimes its low with same configuration and same case with the connection rate/number of connections

amitchauhan1502 avatar Jun 03 '20 11:06 amitchauhan1502

You have loglevel: 4 but no errors in .log?

licaon-kter avatar Jun 03 '20 12:06 licaon-kter

Hi, Only significant error message appearing in ejabberd.log are of 2 types:

Type1: 2020-06-03 12:49:37.018 [warning] <0.139.0> lager_error_logger_h dropped 415 messages in the last second that exceeded the limit of 100 messages/sec 2020-06-03 12:49:38.000 [warning] <0.139.0> lager_error_logger_h dropped 212 messages in the last second that exceeded the limit of 100 messages/sec 2020-06-03 12:49:39.005 [warning] <0.139.0> lager_error_logger_h dropped 392 messages in the last second that exceeded the limit of 100 messages/sec

Type 2: 2020-06-03 12:50:19.004 [info] <0.23913.3>@mod_stream_mgmt:transition_to_pending:451 (tcp|<0.23913.3>) Closing c2s connection for user7_9671@IP/rtb: Connection failed: connection closed; waiting 300 seconds for stream resumption

2020-06-03 12:53:18.017 [info] <0.834.4>@mod_stream_mgmt:transition_to_pending:451 (tcp|<0.834.4>) Closing c2s connection for user7_15150@IP/rtb: Connection failed: connection closed; waiting 300 seconds for stream resumption

2020-06-03 12:53:20.001 [info] <0.23842.3>@mod_stream_mgmt:transition_to_pending:451 (tcp|<0.23842.3>) Closing c2s connection for user7_18438@IP/rtb: Connection failed: connection closed; waiting 300 seconds for stream resumption

However frequency of such messages significantly lower than rtb log error message: [debug] FORMAT ERROR: "~s terminated: ~s; reconnecting..." [[{...},35,"****"],<<"Connection failed: timeout">>]

Please let me know if any more inputs required

amitchauhan1502 avatar Jun 03 '20 13:06 amitchauhan1502