http-lwt-client icon indicating copy to clipboard operation
http-lwt-client copied to clipboard

add keep_alive

Open craff opened this issue 1 year ago • 6 comments

I needed to able to send many request, and I wanted to keep the connection alive. I did this modification, and now I submit this PR. We can discuss on it if you do not like the solution I adopted.

As it is driven by an optional argument, it should be mostly compatible with existing code.

I did not clean the commit yet, there are some diff due to reindentation of the code by my editor. If you agree with the design, I will clean the code to have a smaller commit.

craff avatar Apr 22 '24 04:04 craff

Thanks for your contribution. I won't be able to review and comment this week on it, since I'm traveling and busy with other things.

hannesm avatar Apr 24 '24 15:04 hannesm

Dear @craff, sorry for my late reply. I think that keep-alive adds a lot of complexity to http-lwt-client.

I struggle a bit with the design - why are the new_keep_alive / active_keep_alive / reset_keep_alive exposed to a API client?

From my perspective, the API client should only bother with "use keep-alive" -- but that opens a can of worms, namely: should the request accept an "existing connection" parameter, and also expose "connection"? If that is the case, clearly some "close" needs to be performed or exposed.

TL;DR: I'm not sure about whether this complexity is worth it in this package -- maybe @reynir or @dinosaure have a different opinion?

hannesm avatar Aug 28 '24 08:08 hannesm

Hello Hannes,

I did not find any http client that is both minimalistic (i.e. not a library with offer both client and server) that support both HTTP 1.1 and HTTP 2 and allow to keep the connection alive.

Keeping the connection alive is important when you are working with an API that requires several small requests, especially over SSL.

My current design is

  • not very natural
  • and broken with HTTP/2 (I discovered that this week, and I plan to inverstigate).

But it keeps the compatibility with the interface with an optional parameter holding the past connection.

Clearly a simpler design is to separate connection from sending requests, and in HTTP/2 we could send requests interleaved which would be nice.

I really need something, my application send more than 10000 requests in 10mn to synchronise a moodle server with an ENT based on ONE/NEO by edifice. It would be more than 10 times slower if I did not keep the connection open (my first version was like this and was indeed too slow).

I struggle a bit with the design - why are the new_keep_alive / active_keep_alive / reset_keep_alive exposed to a API client?

This is used only to implement a form of retry. Very often the other side cut the connection (too long inactivity, max requests per connections, network failure). The alternative would be one more parameter: max_retry_reconnect ? and push the retry code in the library.

Cheers, Christophe

Hannes Mehnert @.***> writes:

Dear @craff, sorry for my late reply. I think that keep-alive adds a lot of complexity to http-lwt-client.

From my perspective, the API client should only bother with "use keep-alive" -- but that opens a can of worms, namely: should the request accept an "existing connection" parameter, and also expose "connection"? If that is the case, clearly some "close" needs to be performed or exposed.

TL;DR: I'm not sure about whether this complexity is worth it in this package -- maybe @reynir or @dinosaure have a different opinion?

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.

-- Christophe Raffalli web: https://raffalli.eu

craff avatar Aug 28 '24 18:08 craff

Reynir Björnsson @.***> writes:

@reynir commented on this pull request.

I'm not familiar with the keep-alive header or how it's useful. I note that in the MDN documentation (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Keep-Alive) it says the keep-alive header is not allowed in HTTP/2 (and HTTP/3).

HTTP/2 and HTTP/3 implicitely means that the connection can be reused. So no need to say it. Http/2 and Http/3 actually allow request in parallel via multeplixing. Furthermore, Http/3 remove the TCP mechanism and to its own acknowledment via UDP, separate for each request. Therefore, it you download to files, and there is a checksum error in one, you only ask (parts) of one of the two files.

Also, as I recall, this library doesn't attempt to reuse connections, and there's no connection pool mechanism or similar. So is this mostly useful if the server does a redirect? And does this code take into account that the redirect may be to a different server?

I agree there is room for a client library that does reuse connections etc., but I'm afraid it is at odds with the goal of a simple http client.

Hello,

A simple http client is meant for a simple API implementation. But any API that requires several requests want to reuse the same SSL connection by default. Reusing is standard from HTTP 1.1 and HTTP 1.0 is obsolete. Therefore, we could argue that reusing should be the default.

Moreover as this library is "lwt" it would be nice if parallel requests where possible, using several connection in http 1.1 and only one in Http/2 or Http/3.

I think a slight rewrite (because it is just a redistribution of the code logic, with a bit more work for HTTP/2) could be

  • opening a connection (which would be one connection for HTTP/2 and a pool of connection for HTTP/1.1)

  • opening a channel : creates a handle to send request on a connection.

  • sending request

Then there could be above that "send_single_request" keeping the actual interface. or open_single_channel to group the two first steps.

Cheers, Christophe


In src/http_lwt_client.ml:

  • | Some { fd = Some fd; _ } ->
  •   Lwt.return (Ok (fd, add_keep_alive headers))
    

Where does this filedescriptor come from?


In src/http_lwt_unix.ml:

  •              ; mutable max : int }
    

+let new_keep_alive () = { fd = None; lock = Lwt_mutex.create (); max = 100 } + +let keep_alive_lock = function

  • | Some _r -> Lwt.return_unit (Lwt_mutex.lock r.lock)
  • | None -> Lwt.return_unit

+let keep_alive_unlock = function

  • | Some ke -> if Lwt_mutex.is_locked ke.lock then Lwt_mutex.unlock ke.lock
  • | None -> ()

+let reset_keep_alive ?(close=false) keep_alive =

  • match keep_alive with
  • | Some ke ->
  • if Lwt_mutex.is_locked ke.lock then Lwt_mutex.unlock ke.lock;
    

Is it safe to just unlock it like this?


In src/http_lwt_unix.ml:

+let must_close keep_alive =

  • match keep_alive with
  • | None -> true
  • | Some { max; _ } when max <= 1 -> true
  • | _ -> false

+let close keep_alive fd =

  • (match fd with
  • | `Plain fd ->
    
  •    if Lwt_unix.state fd <> Lwt_unix.Closed then Lwt_unix.close fd
    
  •    else Lwt.return_unit
    
  • | `Tls t -> Tls_lwt.Unix.close t)
    
  • = (fun () -> reset_keep_alive keep_alive)

+let read keep_alive fd buffer =

Here we have fd and potentially a fd in keep_alive - are we guaranteed they are the same?

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.

-- Christophe Raffalli web: https://raffalli.eu

craff avatar Aug 29 '24 17:08 craff

Reading the comments, I'm more and more convinced that "add keep-alive" is a major redesign of this library, and something I didn't have in mind when starting with http-lwt-client.

Of course, thread pools, keepalive, reusing existing conmections -- all that is super-useful in a decent http client library - I think I won't have time and energy in the upcoming years to review or design such a library.

I'm happy if others take on the torch, but for the sake of usability and stability, I'd appreciate to either do it in a separate project (hey, this code is ISC, you can copy and redistribute it elsewhere) or come up with a pretty conclusive design and tests. I'd as well like to hear @reynir and @dinosaure opinion on the scope of this package.

hannesm avatar Aug 29 '24 20:08 hannesm

I would agree that the scope of http-lwt-client remains fairly minimal, but I also agree that the use of keep-alive for http/1.1 is also interesting. I think that, in view of the logic required behind it, it might be wiser to offer at least one other module enabling a tcp/ip connection to be reused for several http/1.1 requests.

How about thinking about a completely different module and a (probably) better implementation without convolving on the basic module? That way, we could have the best of both worlds :) .

dinosaure avatar Sep 02 '24 08:09 dinosaure