
Serve, optimize and scale PyTorch models in production

Results: 432 serve issues

## Description This PR adds a manually triggerable GitHub Action that will make an official release. The way this works is that, assuming a code freeze, we will run a bunch...

ci

### 🚀 The feature We need a feature for sharing a GPU across models. It could be configured by setting 0 < workers < 1 for a model. ### Motivation, pitch...
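A hypothetical sketch of what the proposed configuration might look like; a fractional `workers` value is the requested feature, not an existing TorchServe option, and the model names are illustrative:

```yaml
# Hypothetical per-model config for the proposed GPU-sharing feature.
# `workers: 0.5` does NOT exist in TorchServe today; it is the ask here:
# two models each claiming half a worker would share one GPU.
modelA:
  workers: 0.5
modelB:
  workers: 0.5
```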

Fixes #1492 TorchServe defines metrics in a metrics.yaml file, including both frontend metrics (i.e. ts_metrics) and backend metrics (i.e. model_metrics). When TorchServe is started, the metrics definition is loaded in...

enhancement
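The frontend/backend split described above can be sketched as a `metrics.yaml` fragment; the shape follows TorchServe's documented format, but the specific metric names and dimensions here are illustrative:

```yaml
# Sketch of a metrics.yaml: frontend metrics under ts_metrics,
# backend (per-model) metrics under model_metrics.
ts_metrics:
  counter:
    - name: Requests2XX
      unit: Count
      dimensions: [Level, Hostname]
model_metrics:
  gauge:
    - name: HandlerTime
      unit: ms
      dimensions: [ModelName, Level]
```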

### 🐛 Describe the bug Hello, I am trying to test the error cases using the Postman toolkit. In these cases, I send the same wrong params continuously; however, only...

bug
p0

### 🚀 The feature **Current Setup:** Currently, the TorchServe backend worker process dies whenever it receives an invalid JSON-formatted request. **Feature Requested:** Instead of killing the backend worker process,...

enhancement
p2
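The requested behavior can be sketched as defensive parsing in a handler's request-processing step: catch the decode error and return an error marker instead of letting the exception propagate and kill the worker. The function and field names below are illustrative, not TorchServe internals:

```python
import json


def safe_parse(raw_body: bytes):
    """Parse a request body, returning (payload, error) instead of raising.

    Returning an error marker lets the caller send a 4xx response to the
    client while the backend worker process stays alive.
    """
    try:
        return json.loads(raw_body), None
    except json.JSONDecodeError as exc:
        # Invalid JSON: report the problem rather than crashing the worker.
        return None, {"code": 400, "message": f"Invalid JSON: {exc}"}


# A valid request parses normally.
payload, err = safe_parse(b'{"input": [1, 2, 3]}')
# A malformed request yields an error marker; the process keeps running.
bad_payload, bad_err = safe_parse(b'{"input": [1, 2')
```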

We have two ONNX models deployed on a GPU machine built on top of the nightly Docker image. - The first model runs with zero failures at 500 QPS (p99...

bug
perf
workflowx

## Description This PR adds an example showing how to create and deploy a single-GPU DLRM model with TorchRec. Because the current TorchRec version 0.2.0 needs PyTorch 1.12.0, this...

After looking into #1744 I noticed we don't actually use our `docs/sphinx/requirements.txt` in CI. This is not great, because if there are issues with upstream dependencies like `markdown`, it means we...
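A minimal sketch of the missing CI step, assuming a GitHub Actions workflow; the step names and the docs build paths are illustrative:

```yaml
# Hypothetical GitHub Actions steps: actually install the pinned docs
# requirements before building, so breakage in upstream dependencies
# (e.g. markdown) surfaces in CI.
- name: Install docs dependencies
  run: pip install -r docs/sphinx/requirements.txt
- name: Build docs
  run: sphinx-build -b html docs/sphinx docs/_build
```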

### 🚀 The feature Register workflows as part of application startup for immediate access to workflow predictions. ### Motivation, pitch Currently, workflows must be registered via the management API at...

enhancement
triaged
workflowx
p2
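The feature would mirror how models can already be auto-loaded at startup via `config.properties`. A hypothetical fragment, assuming the existing model-loading keys as a template; `load_workflows` is a proposed key, not an existing option:

```properties
# Existing behavior: models in the model store are loaded at startup.
model_store=/opt/model_store
load_models=all
# Proposed (hypothetical key): do the same for workflow archives,
# instead of requiring a management API call after startup.
workflow_store=/opt/workflow_store
load_workflows=all
```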

Authors: [Hamid Shojanazeri](https://github.com/HamidShojanazeri), [Shen Li](https://github.com/mrshenli) ## **Problem statement** Currently, TorchServe does not have a general solution for serving large models for inference. The only available support is in Hugging Face (HF) [...

enhancement