LitServe Map `decode_request` during dynamic batching using a threadpool

Open aniketmaurya opened this issue 1 year ago • 2 comments

🚀 Feature

A default optimization that LitServe could provide is to map the `decode_request` function over the requests in a batch using a thread pool when dynamic batching is enabled. This helps when decoding is I/O-bound, for example when loading images.
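To make the idea concrete, here is a minimal sketch using only the standard library. It is not LitServe's internal batching code: `decode_request` below is a hypothetical stand-in for a user-defined decode step, `decode_batch` is an illustrative helper name, and the `time.sleep` call only simulates blocking I/O such as reading an image.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def decode_request(request: dict) -> str:
    """Hypothetical stand-in for a user-defined decode step (e.g. loading an image)."""
    time.sleep(0.05)  # simulate blocking I/O; the GIL is released while waiting
    return request["input"].upper()


def decode_batch(requests: list[dict], pool: ThreadPoolExecutor) -> list[str]:
    """Decode every request in a batch concurrently; results keep the input order."""
    return list(pool.map(decode_request, requests))


# A long-lived pool avoids paying thread start-up cost on every batch.
pool = ThreadPoolExecutor(max_workers=4)
batch = [{"input": f"img-{i}"} for i in range(4)]

start = time.perf_counter()
decoded = decode_batch(batch, pool)
elapsed = time.perf_counter() - start
print(decoded, f"{elapsed:.2f}s")
```

Because `ThreadPoolExecutor.map` returns results in the order of its inputs, the decoded items still line up with their originating requests. Threads should mainly help when decoding waits on I/O; a CPU-bound decode step would likely see little benefit.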

I did a quick test with a ResNet-152 model for image classification and observed the following throughput (Requests per second) performance gain with threadpool:

Motivation

Pitch

Alternatives

Additional context

Jul 08 '24 16:07 aniketmaurya

Hi @aniketmaurya, have you already thought about how to implement this?

I'd be interested in implementing it.

Sep 21 '24 14:09 grumpyp

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Apr 16 '25 06:04 stale[bot]
