grpc-java

gRPC slower than REST for higher number of `repeated` elements

Open ashj11 opened this issue 1 year ago • 5 comments

What version of gRPC-Java are you using?

I am using gRPC via https://github.com/grpc-ecosystem/grpc-spring. The grpc-java version used by that module is 1.58.0 (https://github.com/grpc-ecosystem/grpc-spring/blob/a2e9520b2414277038a5a55551529f5023e8520b/docs/en/server/getting-started.md?plain=1#L52).

What is your environment?

Reproducible in multiple environments. I have tried on my MacBook as well as on an EC2 instance with Ubuntu. Java 17. Client and server are on two different EC2 machines in the same VPC in the same AWS region.

What did you expect to see?

I expected gRPC to be faster even if the response returned has a high number of repeated elements.

What did you see instead?

gRPC was slower than REST with JSON once the response had more than around 25 repeated elements. The graph shows the trend; the X axis shows the response time (p95) in ms.

[image: graph of the trend]

Steps to reproduce the bug

Test setup code: https://github.com/ashj11/grpc-rest-test

This is a simple Spring Boot project which serves both gRPC and REST endpoints. The server just returns a pre-created JSON/proto object when the method is called. Grpc-Client-Page-children.py is the client, which calls both the gRPC and REST endpoints for 100 iterations for every child count from 1 to 100. The requests are made sequentially.

gRPC service - MyGRPCUpstreamImpl.java
REST service - MyRestUpstreamImpl.java
Proto definition - my_upstream.proto

I found a Stack Overflow link which states this is expected. But I was surprised to see the results and would like to confirm whether gRPC is not expected to handle more than, say, 25 repeated elements.
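For readers who want to reproduce the measurement without the original client, the sequential p95 loop described above can be sketched roughly as follows. This is not the code from Grpc-Client-Page-children.py: `fake_endpoint` is a hypothetical stand-in for one gRPC or REST call, and the linear-interpolation percentile is an assumption about how p95 was computed.

```python
import time


def percentile(samples, pct):
    """Linear-interpolated percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(samples)
    if len(ordered) == 1:
        return ordered[0]
    rank = (pct / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def measure_ms(call, iterations=100):
    """Invoke `call` sequentially and return the per-call latencies in ms."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def fake_endpoint():
    # Hypothetical stand-in for a single gRPC or REST request.
    sum(range(1000))


if __name__ == "__main__":
    print("p95 ms:", percentile(measure_ms(fake_endpoint), 95))
```

Swapping `fake_endpoint` for a real request function, and running it once per child count, would give one p95 value per point on the graph.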

ashj11 Feb 03 '24 10:02

We would need to play to confirm, but I doubt grpc-java has much to do with the performance you are seeing. message Child looks relatively small; small enough that I'd expect 100 repetitions to be within a ms (on a warm JVM).

The most likely explanation is that the Python code is using a Python-powered protobuf. Protobuf in Python has two forms: Python-based and C-based. The json module in Python is a C module, so you probably need to use the C-based protobuf for an apples-to-apples comparison. The main protobuf project provided both forms, and I thought the newer generated code used the C++ decoder, but I am not much help with the specifics of checking or changing which one you are using.

ejona86 Feb 04 '24 18:02

Added a small note that client and server are different EC2 machines on same VPC in same AWS region.

Thanks for your reply @ejona86. I verified on the client that the protobuf implementation used by Python is upb, which is the C implementation. Command used: `python3 -c "from google.protobuf.internal import api_implementation; print(api_implementation.Type())"`. Also, in the Stack Overflow link that I mentioned, the implementation is in Node.js. It is possible this is not an issue with the Java implementation per se, but with protobufs in general, as mentioned in the answer there(?)
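The same check can be written as a small script that also runs when protobuf is not installed. This is only a sketch: `api_implementation.Type()` is the call from the command above, while the helper names and the descriptions in `DESCRIPTIONS` are illustrative additions, not part of protobuf.

```python
def protobuf_backend():
    """Return the active protobuf backend name, or None if protobuf is missing."""
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


# Illustrative labels for the backend names protobuf is known to report.
DESCRIPTIONS = {
    "upb": "C implementation (upb)",
    "cpp": "C++ extension",
    "python": "pure Python (slowest)",
}


def describe_backend(backend):
    if backend is None:
        return "protobuf is not installed"
    return DESCRIPTIONS.get(backend, "unknown backend: " + backend)


if __name__ == "__main__":
    print(describe_backend(protobuf_backend()))
```

If the script reports pure Python, the `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION` environment variable is the usual way to request a different backend, provided that backend is available in the installed package.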

ashj11 Feb 05 '24 10:02

The Stack Overflow question is adding MBs of repeated integers, which is a lot more than is involved here. And Node.js would naturally favor JSON more than Python would, since it is pure-JS. Thanks for confirming the protobuf implementation you were testing on.

ejona86 Feb 06 '24 01:02

Also uploaded my-upstream-service-1.0-SNAPSHOT.jar so that you can try running the service with java -jar command.

ashj11 Feb 06 '24 05:02

Any update on this?

ashj11 Feb 16 '24 04:02

(You probably saw an earlier response in your email, now deleted. I was accidentally testing with Grpc-Client.py, so it was sort of useless. This was done with a hacked Grpc-Client.py that was changed to call GetPage/pageinfo; I got some errors I didn't want to bother with on the other one. I also set NUM_OF_CHILDREN = 10000 when building the server)

The slowdown definitely seems to be on the Python side. I am worried about slow JVM warmup, but it is minor here compared to the slowdown on the client.

I see you are using grpc_requests. That is not the normal client, and it may be the cause of the slow-down here. It might not cause a slow-down since even the normal Python processing uses reflection information, but it does mean it isn't worth me digging in too deeply.

NUM_OF_CHILDREN = 10000 with full Python-side decoding (not fair):

Rest p95 : 44.494299999999996
GRPC p95 : 228.44089999999997

NUM_OF_CHILDREN = 10000 using client.request(service, method, {}, raw_output=True) to disable grpc_requests decoding (fair):

Rest p95 : 43.82405
GRPC p95 : 19.9179

Since the server responded in ~20 ms, that means ~200 ms of latency was on the Python side. This proves the problem isn't with Java. Also, this comparison is fair, since requests isn't decoding the JSON either.

To compare them both decoding, NUM_OF_CHILDREN = 10000 with original grpc decoding and adding a requests.json() for REST (fair):

Rest p95 : 51.73215
GRPC p95 : 227.7057
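The arithmetic behind these comparisons can be made explicit with a short sketch. It only subtracts the p95 figures reported above, assuming they are in ms; since the raw and decoded numbers come from separate runs, the differences are rough estimates rather than precise measurements.

```python
# p95 latencies in ms copied from the runs above (NUM_OF_CHILDREN = 10000).
GRPC_RAW_MS = 19.9179       # raw_output=True, no client-side decoding
GRPC_DECODED_MS = 227.7057  # grpc_requests decoding enabled
REST_RAW_MS = 43.82405      # response body not parsed
REST_DECODED_MS = 51.73215  # response parsed with requests' .json()


def decode_overhead_ms(decoded_ms, raw_ms):
    """Estimate client-side decoding cost as decoded p95 minus raw p95."""
    return decoded_ms - raw_ms


grpc_decode_ms = decode_overhead_ms(GRPC_DECODED_MS, GRPC_RAW_MS)
rest_decode_ms = decode_overhead_ms(REST_DECODED_MS, REST_RAW_MS)

if __name__ == "__main__":
    print(f"gRPC client-side decoding: ~{grpc_decode_ms:.1f} ms")
    print(f"REST client-side decoding: ~{rest_decode_ms:.1f} ms")
```

On these figures, client-side decoding accounts for roughly 208 ms on the gRPC path versus roughly 8 ms on the REST path, which is consistent with the slowdown being in the Python client rather than the Java server.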

I don't think there's any more to do here from the Java side.

ejona86 Mar 22 '24 23:03

There has been no activity on the issue in a couple of weeks, so I'm assuming this is resolved. If not, we can reopen.

temawi Apr 10 '24 16:04