fluentd icon indicating copy to clipboard operation
fluentd copied to clipboard

out_http: Add option to reuse connections

Open Garfield96 opened this issue 2 years ago • 7 comments

Which issue(s) this PR fixes: N/A

What this PR does / why we need it: The HTTP output plugin recreates a connection for every HTTP request. In combination with TLS and high network latencies, this can result in very poor performance. This PR adds an option to reuse connections.

The implementation caches one connection per flush thread. The cache is part of the HTTP plugin instance and is implemented as an array. When a thread creates its first request, it gets assigned a slot (id) in this array and the id is stored in a thread local variable. Since this slot is exclusively used by a single thread, no synchronization is required. If the endpoint changes at runtime, the old connection will be closed and replaced by a new connection to the new endpoint. During shutdown, all open connections are finished in the close method.

This implementation works very well for a static endpoint. If the endpoint changes frequently, there will be little benefit. However, it can be assumed that most users use a static endpoint.

Benchmark I tested the throughput with two Fluentd instances on a single machine. One instance acting as sender and the other as receiver. Even though the connection creation was very fast, because both instances were located on the same machine and TLS was not enabled, activating connection reuse doubled the throughput.

Sender configuration:

<source>
  @type sample
  sample {"hello":"world"}
  tag sample
  rate 5000
</source>

<match **>
  @type http

  endpoint http://localhost:9000
  open_timeout 2
  reuse_connections true
  <format>
    @type json
  </format>
  <buffer>
    flush_interval 10s
    chunk_limit_records 1
    flush_thread_count 32
  </buffer>
</match>

Receiver configuration:

<system>
  workers 1
</system>

<source>
  @type http
  port 9000
  bind 0.0.0.0
  body_size_limit 32m
  keepalive_timeout 10s
</source>

<match **>
  @type flowcounter_simple
  unit second
</match>

Docs Changes:

Release Note: out_http: Add option to reuse connections

Garfield96 avatar Oct 22 '23 12:10 Garfield96

Hi @daipom, are you fine with this change? If no bigger changes are required, I'll add some tests.

Garfield96 avatar Oct 30 '23 16:10 Garfield96

Hi @daipom @ashie , I added a test case which shows that the feature works for the general case as well as for interrupted connections. Do you have an estimate when this feature can be released?

Garfield96 avatar Nov 17 '23 16:11 Garfield96

Thanks for this enhancement. Sorry for my late response. I was busy releasing v1.16.3 :cry:, but it was finally released on 11/14!

Now, I'm working on releasing fluent-package v5.0.2, which contains Fluentd v1.16.3. It will be released at the end of this month.

Since this adds the new option, it will be released on v1.17.0 (fluent-package v5.1.0). That date is not yet fixed, I think, but probably be early next year. I'm sorry for not seeing the details yet, but I will see this for v1.17.0 this month or next month.

daipom avatar Nov 18 '23 01:11 daipom

@ashie @daipom Can you please review this PR as well as https://github.com/fluent/fluentd/pull/4331? Thanks

Garfield96 avatar Jan 13 '24 21:01 Garfield96

Sorry for my late response. I'm seeing this.

It does not affect the existing specifications. It looks basically good.

However, there are a few points that concern me. I will comment on them later.

daipom avatar Jan 15 '24 08:01 daipom

Hi @daipom,

Thanks for the review. I agree that using a hash makes the code simpler. Unfortunately, hashes are not thread-safe in Ruby and inserting the CacheEntries could result in data-races. Therefore, we would again need a mutex. The array solution had the advantage, that the mutex was only used during the first request of a thread to retrieve the index. Afterwards, the code was lock-free.

Garfield96 avatar Feb 04 '24 11:02 Garfield96

@Garfield96 Thanks for considering!

Unfortunately, hashes are not thread-safe in Ruby and inserting the CacheEntries could result in data-races.

I think it's basically thread-safe, although some of the features are not thread-safe (e.g. default_proc (https://bugs.ruby-lang.org/issues/19237)).

||= in my suggestion seems to be not thread-safe, but Thread.current.name is a unique key to the thread.

So, I don't expect my suggestion to cause data races. Have you actually checked the data race?

daipom avatar Feb 05 '24 01:02 daipom

Rebased to the latest master, and add my DCO.

daipom avatar Apr 30 '24 03:04 daipom

I have added only the following additional fixes.

  • Move the cache struct definition outside of the constructor.
  • Make some parameter names more specific.

I have not added commits about the logic simplification that I'm suggesting because we don't have a consensus in the discussion about data race concerns (https://github.com/fluent/fluentd/pull/4330#issuecomment-1926072224).

daipom avatar Apr 30 '24 03:04 daipom

Thanks!

ashie avatar Apr 30 '24 05:04 ashie

Thanks so much @Garfield96 ! Sorry for taking a long time. :cry:

daipom avatar Apr 30 '24 05:04 daipom