LitServe
Map `decode_request` during dynamic batching using a threadpool
🚀 Feature
A default optimization LitServe could offer: when dynamic batching is enabled, map the `decode_request` function over the requests in a batch using a thread pool. This helps when decoding is I/O-bound, for example when loading images.
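To make the idea concrete, here is a minimal sketch of what mapping `decode_request` over a batch could look like. It does not use LitServe internals: `decode_request` below is a hypothetical stand-in that sleeps to mimic I/O, and `decode_batch_sequential`, `decode_batch_threaded`, and `max_workers=8` are illustrative names and values, not part of the LitServe API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def decode_request(request: dict) -> str:
    """Hypothetical stand-in for a user-defined decode_request.

    The sleep mimics I/O-bound work such as fetching or loading an image.
    """
    time.sleep(0.05)
    return request["input"].upper()


def decode_batch_sequential(requests: list) -> list:
    # Baseline: decode each request in the batch one after another.
    return [decode_request(r) for r in requests]


def decode_batch_threaded(requests: list, max_workers: int = 8) -> list:
    # Proposed behavior: decode the requests of one batch concurrently.
    # Executor.map returns results in input order, so each decoded item
    # still lines up with the request it came from.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(decode_request, requests))


if __name__ == "__main__":
    batch = [{"input": f"img_{i}"} for i in range(8)]

    start = time.perf_counter()
    sequential = decode_batch_sequential(batch)
    t_sequential = time.perf_counter() - start

    start = time.perf_counter()
    threaded = decode_batch_threaded(batch)
    t_threaded = time.perf_counter() - start

    print(f"sequential: {t_sequential:.3f}s, threaded: {t_threaded:.3f}s")
    print("same results:", sequential == threaded)
```

Threads only help here because the simulated work releases the GIL while waiting. A CPU-bound `decode_request` would likely see little benefit, and a real implementation would probably reuse one long-lived pool rather than create one per batch.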
I ran a quick test with a ResNet-152 image classification model and observed the following throughput (requests per second) gain with the thread pool:
Motivation
Pitch
Alternatives
Additional context
Hi @aniketmaurya, have you already thought about how to implement this?
I'd be interested in implementing it.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.