grpc-dotnet icon indicating copy to clipboard operation
grpc-dotnet copied to clipboard

Error starting gRPC call when using client side load balancing

Open plachta11b opened this issue 2 years ago • 9 comments

Hi, I can break sockets (maybe they break either way) somehow by turning load balancing on. After a short idle client loses the ability to communicate. There is a visible continuation of some communication in Wireshark. Some ICMP redirect packets and probably heartbeat. Any idea for a fix or workaround better than flooding the server with requests?

What version of gRPC and what language are you using?

Grpc.Net.Client 2.60.0-pre1 (same behavior in older versions)

Not sure what's on the server currently. Probably 2.59.0. The server is in my reach and could be updated if needed.

What operating system (Linux, Windows,...) and version?

OS: win10

What runtime / compiler are you using (e.g. .NET Core SDK version dotnet --info)

.NET SDK version: 8.0.100

What did you do?

I added load balancing to the GRPC client app. This app was working fine. Adding RoundRobinConfig option into:

  .ConfigureChannel(o =>
  {
      o.Credentials = credentials;
      ...
      o.ServiceConfig = new ServiceConfig
      {
          LoadBalancingConfigs = { new RoundRobinConfig() },
      };
  });

makes the app unusable after a while. This happens only when the app is idle for a minute. Increasing the request rate with the request period configured to 3 seconds (60 default) workaround this issue.

What did you expect to see?

No error.

What did you see instead?

The app works fine for a few minutes and then it throws exception:

Error starting gRPC call. HttpRequestException: Requesting HTTP version 2.0 with version policy RequestVersionOrHigher while unable to establish HTTP/2 connection.
Logs in the moment when the app breaks // service working {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Starting gRPC call. Method type: 'Unary', URI: 'https://our-cluster-backend.whatever/dataservice/GetData'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Successfully picked subchannel id '2-1' with address REDACTED_URL:443. Transport status: ActiveStream"} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Starting gRPC call. Method type: 'Unary', URI: 'https://our-cluster-backend.whatever/dataservice/GetData'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Successfully picked subchannel id '2-1' with address REDACTED_URL:443. Transport status: ActiveStream"} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Finished gRPC call."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Finished gRPC call."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Starting gRPC call. Method type: 'Unary', URI: 'https://our-cluster-backend.whatever/dataservice/GetData'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Successfully picked subchannel id '2-1' with address REDACTED_URL:443. Transport status: ActiveStream"} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Finished gRPC call."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Starting gRPC call. Method type: 'Unary', URI: 'https://our-cluster-backend.whatever/dataservice/GetData'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Starting gRPC call. Method type: 'Unary', URI: 'https://our-cluster-backend.whatever/dataservice/GetData'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Successfully picked subchannel id '2-1' with address REDACTED_URL:443. Transport status: ActiveStream"} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Successfully picked subchannel id '2-1' with address REDACTED_URL:443. Transport status: ActiveStream"} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Finished gRPC call."} // service not working {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Finished gRPC call."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Sending message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Subchannel","message":"Subchannel id '2-1' state changed to Idle. Detail: 'Disconnected.'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel state updated to Connecting."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel picker updated."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Subchannel","message":"Subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport","message":"Subchannel id '2-1' connected to socket REDACTED_URL:443."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Subchannel","message":"Subchannel id '2-1' state changed to Ready. Detail: 'Successfully connected to socket.'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel state updated to Ready."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel picker updated."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Starting gRPC call. Method type: 'Unary', URI: 'https://our-cluster-backend.whatever/dataservice/GetData'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Starting gRPC call. Method type: 'Unary', URI: 'https://our-cluster-backend.whatever/dataservice/GetData'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Successfully picked subchannel id '2-1' with address REDACTED_URL:443. Transport status: InitialSocket"} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Successfully picked subchannel id '2-1' with address REDACTED_URL:443. Transport status: InitialSocket"} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Error starting gRPC call."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Error starting gRPC call."} {"level":"INFO","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Call failed with gRPC error status. Status code: 'Internal', Message: 'Error starting gRPC call. HttpRequestException: Requesting HTTP version 2.0 with version policy RequestVersionOrHigher while unable to establish HTTP/2 connection.'.","exception":"System.Net.Http.HttpRequestException: Requesting HTTP version 2.0 with version policy RequestVersionOrHigher while unable to establish HTTP/2 connection.\r\n at System.Net.Http.HttpConnectionPool.ThrowGetVersionException(HttpRequestMessage request, Int32 desiredVersion, Exception inner)\r\n at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)\r\n at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)\r\n at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)\r\n at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)\r\n at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)\r\n at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)\r\n at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout)"} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Finished gRPC call."} {"level":"INFO","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Call failed with gRPC error status. Status code: 'Internal', Message: 'Error starting gRPC call. HttpRequestException: Requesting HTTP version 2.0 with version policy RequestVersionOrHigher while unable to establish HTTP/2 connection.'.","exception":"System.Net.Http.HttpRequestException: Requesting HTTP version 2.0 with version policy RequestVersionOrHigher while unable to establish HTTP/2 connection.\r\n at System.Net.Http.HttpConnectionPool.ThrowGetVersionException(HttpRequestMessage request, Int32 desiredVersion, Exception inner)\r\n at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)\r\n at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)\r\n at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)\r\n at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)\r\n at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)\r\n at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)\r\n at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout)"} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"gRPC call canceled."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Finished gRPC call."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"gRPC call canceled."} {"level":"INFO","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Error status code 'Internal' with detail 'Error starting gRPC call. HttpRequestException: Requesting HTTP version 2.0 with version policy RequestVersionOrHigher while unable to establish HTTP/2 connection.' raised."} {"level":"INFO","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Error status code 'Internal' with detail 'Error starting gRPC call. HttpRequestException: Requesting HTTP version 2.0 with version policy RequestVersionOrHigher while unable to establish HTTP/2 connection.' raised."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Subchannel","message":"Subchannel id '2-1' state changed to Idle. Detail: 'Disconnected.'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel state updated to Connecting."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel picker updated."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Subchannel","message":"Subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport","message":"Subchannel id '2-1' connected to socket REDACTED_URL:443."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Subchannel","message":"Subchannel id '2-1' state changed to Ready. Detail: 'Successfully connected to socket.'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel state updated to Ready."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel picker updated."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport","message":"Subchannel id '2-1' socket REDACTED_URL:443 is in a bad state and can't be used."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport","message":"Subchannel id '2-1' socket REDACTED_URL:443 is being closed because it can't be used. Socket lifetime of 00:00:35.0610584. The socket either can't receive data or it has received unexpected data."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Subchannel","message":"Subchannel id '2-1' state changed to Idle. Detail: 'Lost connection to socket.'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel state updated to Connecting."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel picker updated."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Subchannel","message":"Subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport","message":"Subchannel id '2-1' connected to socket REDACTED_URL:443."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Subchannel","message":"Subchannel id '2-1' state changed to Ready. Detail: 'Successfully connected to socket.'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel state updated to Ready."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Channel picker updated."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.AspNetCore.Server.ServerCallHandler","message":"Reading message."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Starting gRPC call. Method type: 'Unary', URI: 'https://our-cluster-backend.whatever/dataservice/GetData'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Starting gRPC call. Method type: 'Unary', URI: 'https://our-cluster-backend.whatever/dataservice/GetData'."} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Successfully picked subchannel id '2-1' with address REDACTED_URL:443. Transport status: InitialSocket"} {"level":"DEBUG","logger":"Grpc.Net.Client.Balancer.Internal.ConnectionManager","message":"Successfully picked subchannel id '2-1' with address REDACTED_URL:443. Transport status: InitialSocket"} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Error starting gRPC call."} {"level":"DEBUG","logger":"Grpc.Net.Client.Internal.GrpcCall","message":"Error starting gRPC call."}

Anything else we should know about your project / environment?

This happens only when requests are sent to one of the Kubernetes(Openshift) clusters. I have not seen this error on the local network yet. It is probably a combination of latency, network errors, and idle time.

Seen in the app that periodically fetches some data.

Server side cfg

The issue was only seen when running against the OpenShift cluster backend. There is probably HAProxy on the way to the gRPC server pod.

EDIT: The ROUTER_DEFAULT_CLIENT_TIMEOUT or ROUTER_DEFAULT_SERVER_TIMEOUT seems to be cause of this issue.

{
  "Kestrel": {
    "Endpoints": {
      "Http": {
        "Protocols": "Http2",
        "Url": "http://+:50050"
      },
      "Https": {
        "Certificate": {
          "Path": "/https/tls-cert.pfx"
        },
        "Protocols": "Http2",
        "Url": "https://+:50051"
      }
    }
  }
}

Client side

~~Any Windows OS where our app was tested. I am using Win10 while debugging this issue.~~ Replicated with C# greeter example app on the linux base OS too.

plachta11b avatar Dec 28 '23 19:12 plachta11b

That error is really strange.

gRPC requires HTTP/2 to work. Because you're using https, the client and server negotiate that HTTP/2 is supported using HTTPS+ALPN. The error message you're seeing happens because the gRPC client requested HTTP/2 and the server said it isn't supported. The only thing I can think of that would cause this is if you have load balancing across multiple servers and some are configured to support HTTP/2 and others aren't. The error would start when client side load balancing sends a request to the server that doesn't support HTTP/2.

JamesNK avatar Jan 02 '24 04:01 JamesNK

HTTP/2 should be ok. Servers work for older client apps without issue for many years. Seems to be a client issue as this was observed with the client update. But there is an update. The same issue was seen even without load-balancing turned on. It is a new feature that was previously disabled. That means that the reported error is not caused by turning on load balancing. It seems that load balancing helps to reproduce this issue faster.

I am now testing the revert to Grpc.Net.Client 2.43.0 which was working for months in the older app. The 2.44.0 version threw the error in a few seconds. The older library seems to be ok even with LoadBalancingConfigs = { new RoundRobinConfig() }, configured.

I will push the new app with 2.43.0 to testers and post any updates.

plachta11b avatar Jan 11 '24 14:01 plachta11b

This is the change log between 2.43.0 and 2.44.0: https://github.com/grpc/grpc-dotnet/compare/v2.43.0...v2.44.0-pre1

I don't see anything that could introduce this issue.

JamesNK avatar Jan 11 '24 23:01 JamesNK

Maybe the problem is with this change in 2.43->2.44:

https://github.com/grpc/grpc-dotnet/commit/b001e02756b84f0ad8a0e6f0c4053f0490d359fb#diff-a72cbfbfa2cbe90181a0ab3d9a923d6169a808014a3aa35d17cb8a08ad7b965cL111

base.SendAsync(request, cancellationToken) call was moved out of try/catch block and there is a possibility of exception in synchronous prolog of SendAsync without calling result.SubchannelCallTracker?.Complete

vladislav-prishchepa avatar Jan 12 '24 13:01 vladislav-prishchepa

@vladislav-prishchepa With the debugger running, it seems that exceptions are thrown correctly on await responseMessageTask.ConfigureAwait(false). Moving the call into try-catch did not fix my issue. Thank you for your tip.

plachta11b avatar Jan 12 '24 15:01 plachta11b

The load balancing is disabled in 2.43.0 due to this commit https://github.com/grpc/grpc-dotnet/commit/0893e5778956759d26e63b81991fbf35e3f09b2b#diff-f00649006f4e02cb5196c3d3a3ace612115b8fcaa90c1f4034b9addff4830f38. This makes sense why the app behavior differs between 2.43.0-pre1, 2.43.0 and 2.44.0. I continue with debugging. I am still curious if LB could affect code that does not use LB or if this part of the issue is just randomly occurring.

This was most confusing as this error popped up before 2.43.0 and after too. There is no mention of this in the release notes: https://github.com/grpc/grpc-dotnet/releases/tag/v2.43.0

Things that differs when LB on:

  • reconnect policy configuration
    // grpc-dotnet\src\Grpc.Net.Client\GrpcChannel.cs
    internal const long DefaultInitialReconnectBackoffTicks = TimeSpan.TicksPerSecond * 1;
    internal const long DefaultMaxReconnectBackoffTicks = TimeSpan.TicksPerSecond * 120;
    // used then in grpc-dotnet\src\Grpc.Net.Client\GrpcChannelOptions.cs
    _maxReconnectBackoff = TimeSpan.FromTicks(GrpcChannel.DefaultMaxReconnectBackoffTicks);
    _initialReconnectBackoff = TimeSpan.FromTicks(GrpcChannel.DefaultInitialReconnectBackoffTicks);

EDIT: This should not be an issue as new sockets are opened after the connection breaks

  • usage of ConnectionManager with channel picker even when LoadBalancingConfigs not configured
  • option IsWaitForReady
  if (Options.IsWaitForReady)
  {
    message.SetOption(BalancerHttpHandler.WaitForReadyKey, true);
  }
  • DnsResolverFactory (should not matter as we are using staticResolverFactory due to https protocol usage) By default used only by ResolverFactory
    internal static readonly ResolverFactory[] KnownLoadResolverFactories = new ResolverFactory[]
    {
        // Could never refreshing be cause for a dead connection? EDIT: tested shorter without effect
        new DnsResolverFactory(Timeout.InfiniteTimeSpan)
    };

plachta11b avatar Jan 12 '24 17:01 plachta11b

I see many of these logs:

"Subchannel id \u0027{SubchannelId}\u0027 socket {EndPoint} is in a bad state and can\u0027t be used."
"Subchannel id \u0027{SubchannelId}\u0027 socket {EndPoint} is being closed because it can\u0027t be used. Socket lifetime of {SocketLifetime}. The socket either can\u0027t receive data or it has received unexpected data."

SocketLifetime=00:00:35.0570277 (usually in the range of 35-40 seconds)

I could check a few things:

  • Are there any unexpected data received? Could it be a Windows-only issue? Maybe something similar to https://github.com/dotnet/runtime/issues/60291#issuecomment-991461309. Currently seen only on Windows clients connecting to the Kestrel backend.
  • Is there something that could downgrade my connection? https://github.com/dotnet/runtime/issues/30185#issuecomment-509428562 (Need to get a deeper look at how SocketsHttpHandler. WinHttpHandler is not used in .NET8 am I right?)
  • is this https://github.com/wfurt/PcapStream a way to capture TLS traffic easily? Seems like exporting the SSLKEYLOGFILE for wireshark is not supported https://github.com/dotnet/runtime/issues/37915 at this moment.
  • Look if and where is the SocketsHttpHandler forced to HTTP/2.

I do not currently know if those logs are even related to this issue. But it seems interesting at least.

plachta11b avatar Jan 12 '24 21:01 plachta11b

The same error replicated with app that was created from GrpcGreeter example. The "Grpc.AspNetCore" nuget Version="2.62.0" was used on both server and client.

The round robin LB was configured:

builder.Services.AddGrpcClient<Greeter.GreeterClient>(o =>
{
    o.Address = new Uri("https://remote-server:443");
    o.ChannelOptionsActions.Add((o) =>
    {
        o.ServiceConfig = new ServiceConfig
        {
            // PickFirstConfig leads to the error too.
            LoadBalancingConfigs = { new RoundRobinConfig() },
        };
    });
});

There is only a single gRPC server under the given URL.

This could be triggering this issue:

plachta11b avatar Apr 25 '24 20:04 plachta11b

I have got the PcapStream running.

// <PackageReference Include="PcapStream" Version="1.0.1" />
using var capture = new NetCapture($"connection-{DateTime.UtcNow.ToString("O")}.pcap");
var socketHandler = new SocketsHttpHandler();

builder.Services.AddGrpcClient<Greeter.GreeterClient>(o =>
{
    o.Address = new Uri("https://my-server:443");
    o.ChannelOptionsActions.Add((o) =>
    {
        o.ServiceConfig = new ServiceConfig
        {
            LoadBalancingConfigs = { new RoundRobinConfig() },
        };
        o.HttpHandler = new SocketsHttpHandler()
        {
            PlaintextStreamFilter = (context, token) =>
            {
                Console.WriteLine(context.NegotiatedHttpVersion);
                Console.WriteLine(context.InitialRequestMessage);
                return new ValueTask<Stream>(capture.AddStream(context.PlaintextStream));
            },
        };
    });
});

One of the TCP streams contains this:

HTTP/1.1 408 Request Time-out
content-length: 110
cache-control: no-cache
content-type: text/html
connection: close

<html><body><h1>408 Request Time-out</h1>
Your browser didn't send a complete request in time.
</body></html>

This looks like the source of this issue. I think that the HTTP1 message could poison httpClient with HTTP1.

Expected source is Haproxy: https://www.haproxy.com/blog/haproxy-and-http-errors-408-in-chrome#haproxys-timeout-related-to-408-errors

plachta11b avatar May 09 '24 11:05 plachta11b