
feat: Support gRPC in BentoML API Server

Open parano opened this issue 4 years ago • 6 comments

Support using gRPC instead of HTTP API for sending prediction requests. When an API model server is deployed as a backend service, many teams prefer using gRPC over HTTP. See related discussion here

One blocker for this is asyncio support on the gRPC side, which is currently under development: https://github.com/grpc/grpc/projects/16

Also, for users waiting for this feature, please note that using gRPC does not necessarily bring better performance. Protobuf is faster when comparing only data serialization time, but when building a model server we also need to account for the computation required to turn deserialized Protobuf objects into a format that users' models can consume. In most ML frameworks, a trained model expects a pandas.DataFrame, np.array, tf.Tensor, or PIL.Image:

JSON Request => pandas.DataFrame => Model

Protobuf Msg => Protobuf Object => pandas.DataFrame => Model

It is this extra step of converting the in-memory Protobuf message into a pandas.DataFrame that makes it less efficient than the JSON/HTTP approach, although gRPC does have some edge over a REST API thanks to HTTP/2 and compression.

Thus, adding gRPC support is not about better performance; it is mostly about convenience for teams that already use gRPC for their backend services.
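To illustrate the extra conversion step described above, here is a minimal sketch. The payload and column names are made up for illustration, and a plain dict of repeated fields stands in for a generated Protobuf message (to keep the example self-contained without protoc-generated classes):

```python
import json

import pandas as pd

# Hypothetical request payload; feature names are assumptions.
json_payload = '{"feature_a": [1.0, 2.0], "feature_b": [3.0, 4.0]}'

# JSON/HTTP path: one step from the wire format to a DataFrame.
df_json = pd.DataFrame(json.loads(json_payload))

# gRPC path (sketch): after Protobuf deserialization you hold an
# in-memory message object -- here a dict of repeated fields stands in
# for it. Its fields still have to be copied, field by field, into a
# DataFrame before the model can consume them; this is the extra step.
proto_like_msg = {"feature_a": [1.0, 2.0], "feature_b": [3.0, 4.0]}
df_proto = pd.DataFrame(
    {name: list(values) for name, values in proto_like_msg.items()}
)

# Both paths end at the same DataFrame; the gRPC path just takes an
# additional conversion to get there.
assert df_json.equals(df_proto)
```

Either way the model sees the same DataFrame; the difference is only in how much work happens between the wire and that DataFrame.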

parano avatar May 20 '20 23:05 parano

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Dec 25 '20 10:12 stale[bot]


Hey, are we implementing this?

harshitsinghai77 avatar Jul 18 '21 07:07 harshitsinghai77

Hi @harshitsinghai77, I believe this is P1, and will be addressed after BentoML 1.0

aarnphm avatar Jul 18 '21 09:07 aarnphm

Got it. Let me know if you need me to work on it.

harshitsinghai77 avatar Jul 18 '21 09:07 harshitsinghai77

Work in progress as part of the current MLH batch. Progress can be tracked on the https://github.com/bentoml/BentoML/tree/grpc branch and in this PR: https://github.com/bentoml/BentoML/pull/2808

aarnphm avatar Jul 05 '22 10:07 aarnphm

Hi all, I'm happy to say that gRPC support has been merged into main via PR #2808. We will include this in the next release as an experimental feature.

aarnphm avatar Sep 16 '22 22:09 aarnphm

gRPC streaming is the way to go next

Bec-k avatar Sep 07 '23 08:09 Bec-k

This is tracked in an adjacent ticket, #4170

aarnphm avatar Sep 07 '23 14:09 aarnphm