FSharp.Data icon indicating copy to clipboard operation
FSharp.Data copied to clipboard

Use System.Net.Http.HttpClient

Open Tarmil opened this issue 3 years ago • 13 comments

This was already proposed in #93, but given how much the landscape has changed since 2013, I thought it better to create a new issue.

The specific reason why I'm proposing it right now is to be able to use FSharp.Data in WebAssembly with Bolero; see this issue. But more generally I think it would be good to update to the modern and recommended library.

Tarmil avatar Aug 28 '21 22:08 Tarmil

Agreed! Would gladly accept a pull request here.

cartermp avatar Aug 29 '21 02:08 cartermp

Hey, I'm new here but I figured I'd give a go at this. I encountered an issue, but I'm not sure how to continue.

HttpClient in .netstandard 2.0 has issues that were fixed in later versions. Basically, you have two problems:

  • If you use one HttpClient per request, you are very likely to run into port exhaustion issues since even disposing the HttpClient does not free the port. The OS would eventually free the ports, but it's a common problem if you do many requests.
  • If you use a single HttpClient instance for all requests, you could potentially run into stale DNS, since the DNS could refresh during the lifetime of the application.

This was fixed in .Net Core 2.1 and later using SocketsHttpHandler. The solution is then to use a single HttpClient using SocketsHttpHandler, which avoids the stale DNS issue. To use this in FSharp.Data, the framework of this library would have to be upgraded and it will no longer support .net framework (since .netstandard 2.0 is the latest version that supports .net framework).

I've come to the following solutions and I'm not sure which I should follow:

  1. Upgrade .Net and stop supporting .netframework
  2. Choose either "bad" solution. Maybe add a way for the user of the library to specify which "problem" they want to handle?
  3. Add some kind of solution that periodically refreshes the HttpClient to avoid the stale DNS issue.
  4. I've found backports of SocketsHttpHandler to .netstandard 2.0. They are available in nuget packages, but I'm wary of adding dependencies to the project.
  5. Write a backport for .netstandards 2.0. The implementation looks fairly complex and is probably error-prone.

I'm willing to do 1-4, but I'm less sure about how to implement 3. 5 looks a little too complex for my liking. I'm open for other solutions as well, if you can find others.

I think it would be a good idea to have a maintainer pitch in.

AlexGagne avatar Oct 27 '21 14:10 AlexGagne

I don't want to add an IWhatever using the AddHttpClient extension method on IServiceCollection. I want to send a flippin' web request without shooting myself in the foot. The current version of FSharp.Data is awesome at that. I really hope we don't lose the ability to send a web request in a single line of code!

aaronmu avatar Nov 02 '21 09:11 aaronmu

@aaronmu this is about replacing internals of FSharp.Data, not what you interface with.

cartermp avatar Nov 03 '21 15:11 cartermp

@cartermp Do you have an opinion on my comment? I would like to work on adding the HttpClient, but it requires a change that could be breaking and I'd like to know a maintainer's opinion on the subject

AlexGagne avatar Nov 08 '21 14:11 AlexGagne

This was fixed in .Net Core 2.1 and later using SocketsHttpHandler

This makes me scream "UGH" into a pillow.

I don't think we're at the point where we can drop support for .NET Standard 2.0 in this package. I wish it were so, because then this would be much simpler.

Since I don't work at Microsoft anymore I can't get stats on relative usage of .NET FX with F# vs. .NET. I know that with .NET 5 there were more F# developers targeting modern .NET than .NET Framework, and so maybe that continues enough to where we can cut another major version and break compat.

@dsyme what do you think? And could you possibly look into usage from Visual Studio to see?

cartermp avatar Nov 09 '21 17:11 cartermp

Could we #if NETSTANDARD21 the use of HttpClient and have both netstandard2.0 and netstandard2.1 tfms?

dsyme avatar Nov 09 '21 22:11 dsyme

We could, but that would lead to either:

  • An implementation in NS 2.0 that is buggy
  • More complicated code with different code paths depending on TFM

I'm really not happy with either approach, but I'd pick the latter if it came down to it.

cartermp avatar Nov 09 '21 22:11 cartermp

I just had a deep dive into this, naively thinking this might no be too hard to do if we were happy with multiple TFM's @AlexGagne your deductions on the .net standard 2.0 issues agreed with my research. @cartermp keep that pillow close...the pain keeps coming.

So I thought I'd have a crack at a possible implementation (as I just hit needing FSharp.Data in blazor too) @Tarmil I had a look at your attempt years back for inspiration, but it appears to be a short term fix for your needs at the time ?

Now I'm a fair noob with HttpClient not having used it much & relying in FSharp.Data for all my http work to date, so I could be misguided in my summations here.

TLDR - IMO, To make proper use of httpClient (say in terms of socket reuse) the public API of FSharp.Data.Http really needs a restructure. It is somewhat possible to integrate HttpClient without changing much of the public API but we will have to give up some of the goodness HttpClient offers, and make a bunch of assumptions for the caller, which just feels awful.

To avoid tuning this into an essay, here are some dot points to chew on

  • FSharp.Data.Http currently uses arg customizeHttpRequest: HttpWebRequest to enable additional configuration beyond what Http provides. Even if this callback remains as part of the signature, the types would have to change to something like HttpClient, HttpMessageHandler, HttpRequestMessage or some ADT equivalent
  • CookieContainers and things like Proxies are configured though HttpMessageHandlers. HttpClient does not manage them directly. An SOC which isn't apparent with FSharp.Data.Http method overloads
  • HttpClient has 3 ctors. The parameterless ctor defaults to different HttpMessageHandler depending on the platform (NSUrlSessionHandler for Xaramin on Apple for example) and not all HttpMessageHandlers have the notion of a CookieContainer
  • HttpMessageHandlers can be supplied to the other 2 ctors of HttpClient but once the first Http call is made, changing any of the settings on the HttpMessageHandler has no effect, and changes against HttpClient directly will throw
  • So the notion of repeatedly passing in CookieContainers and Cookies with each call to a static FSharp.Data.Http makes little sense

Some of ideas so far -

  • I think FSharp.Data.Http should become more of a "Builder/Factory" returning wrapped HttpClient instances that are preconfigured with Proxies, CookieContainers etc.
  • These returned instances are then what the caller makes the actual Http calls against. Passing in url's, headers, query strings etc as required
  • The Http calls of HttpClient are threadsafe, so once an instance is returned from the builder, it would then be possible to generate any number of async http calls and simply throw them at Async.Parallel (as an example) and all without exhausting sockets!
  • This approach gives the flexibility of generating separate instances with different HttpMessageHandler configs, and then throwing requests at any one in any way the caller needs.

If .net standard 2.0 is still targeted, I don't think it would make FSharp.Data.Http more buggy. Agreed a caller on two different platforms would get a different result, but this is not really FSharp.Data.Http's problem. It is a runtime problem. I use many packages that have idiosyncrasy between platforms and if the reasons why are called out in the docs, I'm fair warned and more accepting.

I'm happy to participate should anyone else be willing wish to chew this off, as It would definitely be something I would use.

ordinaryorange avatar Apr 01 '22 06:04 ordinaryorange

In your suggestion, would using the Factory part be optional? My main use-case is simple calls with little configuration (I'm not sure if this is the main use case of people who use FSharp.Data.Http), so if there was a default HttpClient used in the simple cases, when HttpClient is not specified, that would work for me.

I'm a little bit wary for existing users who can only support .netstandard 2.0. It's not just that the behaviour is different between TFMs, but it's that the next version introduces a possible bug (stale DNS) in existing code unless the user makes changes. Even then, there is no correct solution the user can choose. They can use an HttpClient per request, leading to a possible port exhaustion, or reuse a single HttpClient and possibly lead into a stale DNS. If I was stuck on .netstandard 2.0, I would probably just not upgrade since the current version does not run into these issues.

AlexGagne avatar Apr 05 '22 18:04 AlexGagne

In your suggestion, would using the Factory part be optional?

I don't see why not, HttpClient has a parameterless ctor. Have not thought through any implications, but I agree any changes need to make sure use of FSharp.Data.Http remain pain free for the caller

but it's that the next version introduces a possible bug (stale DNS) in existing code unless the user makes changes

If my assertions are correct, and that the FSharp.Data.Http API needs to change, then said users would have to make changes to their code base as the API surface will have changed.

Just researching the DNS matter a bit more. This article suggests stale DNS could be a problem even on platforms that use SocketsHttpHandler under the hood by default. As active connections are never closed. The caller has to explicitly deal with it.

According to the docs on HttpClient HttpWebRequest is used under the hood for .NET Framework, but I cant find any information on how DNS is handled here. I wonder if it suffers the same DNS issues already making the DNS discussion moot ?

ordinaryorange avatar Apr 06 '22 03:04 ordinaryorange

I ran some tests on the current implementation to see where this is at, and it appears socket exhaustion is already a potential problem with the current HttpWebRequest based implementation.

Based on that original HttpClient article I ran a simple program using the existing FSharp.Data.Http utility, ran netstat.exe, and lo and behold, I had a bunch of TIME_WAIT connections hanging out there:

The program:

open FSharp.Data

for i in [ 1..100 ] do
    Http.Request("http://something.local")  |> ignore

Netstat result:

  TCP    192.168.1.238:62635    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62636    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62637    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62638    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62639    something.local:9090           TIME_WAIT
  ... (truncated for brevity, but all ports in this range were used as well)
  TCP    192.168.1.238:62733    something.local:9090           TIME_WAIT
  TCP    192.168.1.238:62734    something.local:9090           TIME_WAIT

So, even if the library wrapped HttpClient and instantiated and disposed of it every time, it appears that the developer experience would not be degraded compared to the current implementation. But that's the bare minimum approach, and not likely the preferred solution. Instead, based on the Recommended Use section of the "HttpClient Guidelines", it appears the library has to consider the underlying Target Framework to some degree in order to follow best practice, that being:

  • .NET Core 2.1+: A static, shared instance of HttpClient using SocketsHttpHandler with decent defaults for PooledConnectionLifetime
  • Anything else: Use and dispose of new instances for every request, and strongly encourage the relying developer to consider using IHttpClientFactory (else we run into stale DNS)

Based on that, I would advocate for reshaping this library around the assumption that the relying developers are the best people to manage the HttpClient instances, since their use cases are many and varied, and those will greatly affect their approach to HttpClient usage. Some outcomes of this might be:

  • Adding an overload to the Load method on Type Providers that takes in an HttpClient or IHttpClientFactory alongside a URL
  • Reshaping the Http module to be focused on Functional extensions to HttpClient and HttpResponseMessage (i.e. wrapping exceptions and failure status codes into Result), rather than an All-purpose HTTP library. I would argue that HttpClient already does this better than HttpWebRequest did, so FSharp's focus should be making functional extensions to the existing and best way to handle HTTP.
    • Note: this would obviously be a breaking change, so it might be better handled by deprecating the old FSharp.Data.Http module and creating a new one that is focused on making HttpClient usage more functional.

Anyway, those are just my thoughts, I would love to hear others' opinions on all this!

kjreills avatar Oct 06 '22 17:10 kjreills

Hmmmm interesting. Would be good to hear what other people think. It feels like we should at least make sure that we're following the recommended use (we should assume .NET Core 2.1+)

So adding the extra overloads sounds about right

dsyme avatar Oct 27 '22 13:10 dsyme