
Possible performance regression (vs unmanaged server) if server handler faults

Open • mgravell opened this issue 1 year ago • 1 comment

What version of gRPC and what language are you using?

2.49.0, C#

What operating system (Linux, Windows,...) and version?

Windows 11

What runtime / compiler are you using (e.g. .NET Core SDK version dotnet --info)

7.0.2

What did you do?

I was investigating server performance in the "fault" case, using a unary handler. The exact scenario is here (despite the location, this doesn't use protobuf-net[.Grpc] at all; this is a pure gRPC example).

In this case, "fault" means the handler throws; the same numbers are obtained whether it throws synchronously or returns a faulted task. The fault is switched on via the FAIL defined at the top of the server files.
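For readers unfamiliar with the two fault modes, here is a minimal sketch of what "sync throw" versus "returning a faulted task" could look like in a unary handler. It is not the code from the linked scenario: the `Request` and `Reply` types, the `Handlers` class, and the choice of `RpcException` with `StatusCode.Internal` are all assumptions made for illustration, and the real benchmark uses generated message types inside a service class.

```csharp
#define FAIL // assumed to mirror the FAIL toggle described above

using System.Threading.Tasks;
using Grpc.Core; // ServerCallContext, RpcException, Status, StatusCode

// Hypothetical placeholder messages; the real scenario uses generated types.
public sealed class Request { }
public sealed class Reply { }

public static class Handlers
{
    // Variant 1: the handler throws synchronously, so no Task is ever returned.
    public static Task<Reply> SyncThrow(Request request, ServerCallContext context)
    {
#if FAIL
        throw new RpcException(new Status(StatusCode.Internal, "fault"));
#else
        return Task.FromResult(new Reply());
#endif
    }

    // Variant 2: the handler returns normally, but the Task is already faulted.
    public static Task<Reply> FaultedTask(Request request, ServerCallContext context)
    {
#if FAIL
        return Task.FromException<Reply>(
            new RpcException(new Status(StatusCode.Internal, "fault")));
#else
        return Task.FromResult(new Reply());
#endif
    }
}
```

Both shapes match the usual unary handler signature (`Task<TResponse>` from a request plus a `ServerCallContext`); removing the `#define FAIL` line switches both variants back to the success path.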

Client and Server are the unmanaged client and server (targeting net48 and net7); CurrentClient and CurrentServer are the managed client and server (targeting net7 only). Logging has been disabled via Grpc: None, and the server console is silent (i.e. this isn't a console logging issue). Permutations can be run simply at the command line (all client/server combinations are configured to use the same endpoint). TLS is not enabled.

success case, unmanaged server, unmanaged client:

  • net48 client, net48 server: 197kops/s
  • net7 client, net48 server: 204kops/s
  • net48 client, net7 server: 377kops/s
  • net7 client, net7 server: 393kops/s

failure case, unmanaged server, unmanaged client:

  • net48 client, net48 server: 154kops/s
  • net7 client, net48 server: 166kops/s
  • net48 client, net7 server: 197kops/s
  • net7 client, net7 server: 223kops/s

success case, managed server, managed client:

  • net7 client, net7 server: 494kops/s

failure case, managed server, managed client:

  • net7 client, net7 server: 116kops/s

What did you expect to see?

The failure case might have somewhat lower performance than the success case, but within acceptable bounds.

What did you see instead?

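On the managed server, the failure case is significantly slower than the success case (116 vs 494 kops/s, roughly a quarter of the throughput). It is also slower than the unmanaged server in the failure case (116 vs 223 kops/s with a net7 client and net7 server).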

Anything else we should know about your project / environment?

mgravell • Feb 01 '23 13:02