serve
Serve, optimize and scale PyTorch models in production
## Description This PR supports a manually triggerable GitHub Action that will make an official release. The way this works: assuming a code freeze, we will run a bunch...
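For reference, the GitHub Actions mechanism behind a manually triggerable workflow is the `workflow_dispatch` trigger. A minimal sketch of such a release workflow follows, with the file name, input, and steps all assumed (this is not the workflow added by the PR):

```yaml
# .github/workflows/official-release.yml  (hypothetical file name)
name: Official Release
on:
  workflow_dispatch:            # manual trigger from the Actions tab
    inputs:
      release_version:          # hypothetical input; the actual PR may differ
        description: "Version to release, e.g. 0.6.0"
        required: true
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build and publish
        run: echo "build/publish steps for ${{ github.event.inputs.release_version }} go here"
```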
### 🚀 The feature We need a feature for sharing a GPU across models. It could be configured by setting 0 < workers < 1 for a model. ### Motivation, pitch...
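Since fractional worker counts do not exist in TorchServe today, any configuration is necessarily hypothetical; a sketch of how the proposal could look in a per-model `model-config.yaml` (the field names follow the existing minWorkers/maxWorkers convention, while the fractional values are the proposed extension):

```yaml
# model-config.yaml (hypothetical: fractional workers are the proposal, not a current option)
minWorkers: 0.5   # proposed: two models configured at 0.5 would share one GPU worker
maxWorkers: 0.5
deviceType: gpu
```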
Fixes #1492 TorchServe defines metrics in a metrics.yaml file, including both frontend metrics (i.e. ts_metrics) and backend metrics (i.e. model_metrics). When TorchServe is started, the metrics definition is loaded in...
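For context, a metrics definition file with both sections has roughly this shape (the metric names, units, and dimensions below are illustrative, not a copy of the shipped metrics.yaml):

```yaml
# metrics.yaml (illustrative excerpt)
ts_metrics:            # frontend metrics
  counter:
    - name: Requests2XX
      unit: Count
      dimensions: [Level, Hostname]
model_metrics:         # backend metrics
  gauge:
    - name: HandlerTime
      unit: ms
      dimensions: [ModelName, Level]
```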
### 🐛 Describe the bug Hello, I am trying to test the error cases using the Postman toolkit. In these cases, I send the same wrong params continuously; however, only...
### 🚀 The feature **Current Setup:** Currently, the TorchServe backend worker process dies whenever it receives an invalid JSON-formatted request. **Feature Requested:** Instead of killing the backend worker process,...
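The requested behavior amounts to catching the parse error per request instead of letting it escape and take down the worker. A minimal sketch in a custom handler (the method shape follows TorchServe's `BaseHandler` convention; the per-request error payload is an assumed convention, not an existing API):

```python
import json
import logging

from ts.torch_handler.base_handler import BaseHandler

logger = logging.getLogger(__name__)


class TolerantHandler(BaseHandler):
    """Sketch: survive malformed JSON bodies instead of dying."""

    def preprocess(self, data):
        # Each element of `data` is a request dict carrying a "data" or
        # "body" key (TorchServe's batching convention). Catch decode
        # errors per request rather than letting them kill the worker.
        parsed = []
        for row in data:
            body = row.get("data") or row.get("body")
            if isinstance(body, (bytes, bytearray)):
                body = body.decode("utf-8", errors="replace")
            try:
                parsed.append(json.loads(body))
            except (json.JSONDecodeError, TypeError) as exc:
                logger.warning("Malformed request body: %s", exc)
                parsed.append({"error": f"invalid JSON: {exc}"})  # assumed error convention
        return parsed
```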
We have 2 ONNX models deployed on a GPU machine built on top of the nightly Docker image. - The first model runs with zero failures at 500 QPS (p99...
## Description This PR adds an example showing how to create and deploy a single-GPU DLRM model with TorchRec. Because the current TorchRec version 0.2.0 requires PyTorch 1.12.0, this...
After looking into #1744, I noticed we don't actually use our `docs/sphinx/requirements.txt` in CI. This is not great because if there are issues with upstream dependencies like `markdown`, it means we...
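The fix is simply to have the docs CI job install from that file, so breakage in pinned upstream dependencies surfaces in CI. A sketch of the step (the workflow layout and build command are assumptions):

```yaml
# excerpt from a docs CI job (hypothetical layout)
steps:
  - uses: actions/checkout@v3
  - name: Install docs dependencies
    run: pip install -r docs/sphinx/requirements.txt
  - name: Build docs
    run: make -C docs html   # assumed build command
```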
### 🚀 The feature Register workflows as part of application startup for immediate access to workflow predictions. ### Motivation, pitch Currently, workflows must be registered via the management API at...
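TorchServe's `config.properties` already has a startup hook for models (`load_models`); the request is the analogous behavior for workflows. A hypothetical sketch (`load_workflows` does not exist; it only names the proposed knob):

```properties
# config.properties
workflow_store=/home/model-server/wf-store   # existing: where .war workflow archives live
load_models=all                              # existing: models registered at startup
load_workflows=all                           # hypothetical: what this feature request asks for
```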
Authors: [Hamid Shojanazeri](https://github.com/HamidShojanazeri), [Shen Li](https://github.com/mrshenli) ## **Problem statement** Currently, TorchServe does not have a general solution for serving large models for inference. The only available support is in Hugging Face (HF) [...