
Why is TorchServe No Longer Actively Maintained?

Open · ily666666 opened this issue 9 months ago · 11 comments

Hello, I noticed that the TorchServe GitHub page has been marked as 'Limited Maintenance,' indicating that the project is no longer actively maintained. Could you share the reasons behind this decision? Is it related to the development direction of the PyTorch ecosystem? Additionally, are there any recommended alternative tools or solutions for deploying PyTorch models? 
Thank you for your response!

ily666666 · Mar 03 '25

Could you clarify your decision? What do you plan to use in the future?

zagorulkinde · Mar 03 '25

If torchserve is no longer the way to serve PyTorch models, what else is out there?

sapphire008 · Mar 05 '25

The best like-for-like replacement right now is probably NVIDIA Triton with the PyTorch backend, I think.
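
A minimal Python client call against Triton's PyTorch (LibTorch) backend might look roughly like this; the model name is hypothetical and the tensor names are illustrative (the backend conventionally uses INPUT__0/OUTPUT__0):

    import numpy as np
    import tritonclient.http as httpclient  # pip install tritonclient[http]

    # assumes a running Triton server with a TorchScript model registered
    # under the hypothetical name "my_torch_model"
    client = httpclient.InferenceServerClient(url="localhost:8000")

    data = np.random.rand(1, 3, 224, 224).astype(np.float32)
    inp = httpclient.InferInput("INPUT__0", list(data.shape), "FP32")
    inp.set_data_from_numpy(data)

    result = client.infer("my_torch_model", inputs=[inp])
    print(result.as_numpy("OUTPUT__0").shape)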

michaeltinsley · Mar 09 '25

Developer of LitServe here - LitServe has a similar API and on-par performance, so it's super easy to port your application. https://lightning.ai/docs/litserve/home/benchmarks
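
For anyone evaluating the port, a minimal LitServe app looks roughly like this (the model and the request field are placeholders):

    import litserve as ls

    class MyAPI(ls.LitAPI):
        def setup(self, device):
            # load the model once per worker; device is e.g. "cuda:0"
            self.model = lambda x: x * 2  # placeholder for a real PyTorch model

        def decode_request(self, request):
            # map the incoming JSON payload to model input
            return request["input"]

        def predict(self, x):
            return self.model(x)

        def encode_response(self, output):
            return {"output": output}

    if __name__ == "__main__":
        server = ls.LitServer(MyAPI(), accelerator="auto")
        server.run(port=8000)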

aniketmaurya · Mar 10 '25

Just want to share a list of resources to go from here...

  • Ray Serve (https://docs.ray.io/en/latest/serve/index.html)
  • Triton Inference Server (https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html)
  • MLflow (https://mlflow.org/)
  • BentoML (https://www.bentoml.com/)
  • KServe (https://kserve.github.io/website/latest/)
  • Seldon MLServer (https://www.seldon.io/solutions/seldon-mlserver/)
  • Cortex (https://www.cortex.dev/)
  • ForestFlow (https://github.com/ForestFlow/ForestFlow)
  • TensorFlow Serving (https://www.tensorflow.org/tfx/guide/serving)
  • DeepDetect (https://www.deepdetect.com/overview/introduction)
  • Multi Model Server (MMS) (https://github.com/awslabs/multi-model-server)

So far I looked at BentoML and RayServe. Here are my thoughts:

  • BentoML seems to have a paywalled offering, BentoCloud, which you cannot self-host. But it has nice features like model management and a Triton Inference Server interface.
  • RayServe also has a Triton inference runtime and model management (a minimal deployment sketch follows below).
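
Just to show the API shape, a minimal Ray Serve deployment looks roughly like this (the model is a stand-in):

    # pip install "ray[serve]"
    from ray import serve
    from starlette.requests import Request

    @serve.deployment(num_replicas=1)
    class MyModel:
        def __init__(self):
            self.model = lambda x: x * 2  # placeholder for a real PyTorch model

        async def __call__(self, request: Request) -> dict:
            payload = await request.json()
            return {"output": self.model(payload["input"])}

    serve.run(MyModel.bind())  # HTTP endpoint on localhost:8000 by default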

In LitServe I cannot find any model management (?)

If you are also coming from TorchServe and like open source, the packaging, and the management API (like me =), feel free to share your experiences/research or correct or add anything; I'm still searching/researching.

Cheers.

whplh · Mar 13 '25

Does anyone have any recommendations for alternative frameworks that allow per-model user-provided code like torchserve's handler.py?
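
For clarity, this is the kind of per-model hook I mean, a simplified TorchServe custom handler:

    from ts.torch_handler.base_handler import BaseHandler

    class MyHandler(BaseHandler):
        def preprocess(self, data):
            # model-specific request parsing goes here
            return super().preprocess(data)

        def inference(self, model_input, *args, **kwargs):
            # self.model is loaded by BaseHandler.initialize from the .mar archive
            return self.model(model_input)

        def postprocess(self, inference_output):
            # model-specific response shaping goes here
            return inference_output.tolist()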

geodavic · Mar 14 '25

The "no longer actively maintained" notice should include a date. Especially true for the documentation at pytorch.org/serve

Without having to dig, I should be able to determine how recently the project was abandoned. Thankfully Google found this repo, and the list of releases shows that the latest release was in 2024-09.

drjasonharrison · Mar 19 '25

Does anyone have any recommendations for alternative frameworks that allow per-model user-provided code like torchserve's handler.py?

Hi @geodavic, I’d recommend LitServe as a great alternative. As a contributor, I can say it offers a user-friendly interface for serving models with excellent performance. Feel free to try it out and let me know if you have any questions! 😊

bhimrazy · Mar 24 '25

@bhimrazy What I miss from TorchServe is its support for separate endpoints and different GPU configurations for multiple models. According to https://lightning.ai/docs/litserve/features/multiple-endpoints#multiple-routes, LitServe doesn't support this.

yuzhichang · Mar 24 '25

Hi @yuzhichang,
By design, LitServe is kept simple yet performant.

Btw, you can easily configure devices, GPUs, and workers while setting up the LitServer (see: LitServer Devices).

For multiple endpoints, I’d suggest creating a Docker image for each endpoint and serving them that way. If you’d like to share any thoughts or use cases on multiple endpoints, feel free to add them to this issue: #271. 😊
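
A rough sketch of that pattern (the API class names are hypothetical): each endpoint gets its own server script, which you can wrap in its own Docker image and pin to its own GPU:

    # server_model_a.py -- one process (and one Docker image) per endpoint;
    # ModelAAPI is a hypothetical ls.LitAPI subclass for the first model
    import litserve as ls
    from model_a_api import ModelAAPI  # hypothetical module

    server = ls.LitServer(ModelAAPI(), accelerator="gpu", devices=1, workers_per_device=2)
    server.run(port=8000)

    # server_model_b.py would do the same with a ModelBAPI on port 8001;
    # pin each container to its own GPU, e.g. via CUDA_VISIBLE_DEVICES=1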

bhimrazy · Mar 24 '25

Just want to share a list of resources to go from here...

...

If you are also coming from TorchServe and like open source, the packaging, and the management API (like me =), feel free to share your experiences/research or correct or add anything; I'm still searching/researching.

Cheers.

@whplh This list is missing OpenVINO Model Server.

If you plan to deploy on Intel CPUs, iGPUs, GPUs, or NPUs, this is the way to go. It supports popular APIs like KServe (both gRPC and REST) and also the OpenAI API if you plan on serving LLMs. And PyTorch models will work out of the box once converted to ONNX or OpenVINO IR format.
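
The conversion step might look roughly like this (the paths and the example model are placeholders):

    import torch
    import torchvision
    import openvino as ov  # pip install openvino

    model = torchvision.models.resnet18(weights=None).eval()
    example = torch.randn(1, 3, 224, 224)

    # option 1: export to ONNX and let the model server load it
    torch.onnx.export(model, example, "model.onnx")

    # option 2: convert straight to OpenVINO IR
    ov_model = ov.convert_model(model, example_input=example)
    ov.save_model(ov_model, "models/resnet/1/model.xml")  # OVMS expects versioned dirs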

dkalinowski · Apr 09 '25