serve
                        Serve, optimize and scale PyTorch models in production
### 🐛 Describe the bug
We encountered problems similar to https://github.com/pytorch/serve/issues/1531, and they happen quite often. See the logs below. We have two workers (9000, 9001) for this model. After worker 9000...
## This PR
* Accepts a `--serialized-file` in `.onnx` format, which is correctly loaded by the base handler using an `ort.InferenceSession()` https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html
* TensorRT format is a `.ts` extension...
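The PR above describes the base handler picking a runtime from the serialized file's extension (`.onnx` for ONNX Runtime, `.ts` for TensorRT). A minimal sketch of that dispatch logic, with a hypothetical helper name (`pick_runtime`) and the actual loader calls left as comments:

```python
import os

def pick_runtime(serialized_file: str) -> str:
    """Choose a model runtime from the file extension (hypothetical helper)."""
    ext = os.path.splitext(serialized_file)[1].lower()
    if ext == ".onnx":
        # would load with ort.InferenceSession(serialized_file)
        return "onnxruntime"
    if ext == ".ts":
        # TorchScript/TensorRT artifact
        return "torch-tensorrt"
    # default: eager/TorchScript PyTorch model
    return "torch"

print(pick_runtime("model.onnx"))  # → onnxruntime
```

The actual handler would branch into the corresponding loader instead of returning a label; this sketch only shows the extension-based selection the PR relies on.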
TorchServe returns no prediction when input data gets bigger (Backend worker did not respond in given time)
### 🐛 Describe the bug
I am passing JSON data to python-requests. For simplification, you can assume the following input: `dic1 = {"main": "this is a main", "categories": "this...`
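The report above sends a JSON dict to TorchServe via python-requests and sees worker timeouts as the payload grows. A minimal sketch of building and sizing such a payload; the endpoint URL and model name are illustrative, and the actual `requests.post` call is left as a comment so the snippet runs without a server:

```python
import json

# Payload resembling the report's input (values illustrative)
dic1 = {"main": "this is a main", "categories": "this is a category"}

body = json.dumps(dic1)  # requests does this for you when passing json=dic1
print(len(body.encode("utf-8")))  # payload size in bytes; large bodies can exceed worker response time

# With a TorchServe instance running, one would post it like this:
# import requests
# resp = requests.post("http://127.0.0.1:8080/predictions/my_model",
#                      json=dic1, timeout=120)
```

Checking the serialized size first helps correlate "Backend worker did not respond in given time" errors with request size.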
### 🚀 The feature
A `client_timeout` setting is used for TorchServe to stop processing a request as soon as possible.
### Motivation, pitch
This feature can reduce the inference failure rate on the client...
This change implements:
- Deserializing inference requests into `InferenceRequestBatch`
- Serializing `InferenceResponseBatch` and writing it to the socket in the OTF protocol

Testing:
- Tested inference request deserialization by passing a sample inference...
## Description
I wanted to add a `-e` mode to our `pip install .` scripts so I could make local changes to Python files and immediately have them reflected without...
## Description
* Folly log integration: I tried to implement a factory pattern to keep logging decoupled to some extent, but generics and macros were giving a lot of trouble...
@msaroufim @HamidShojanazeri As discussed in #1818, I have uploaded all the relevant files so far. Tasks:
- [ ] Add preliminary support for [HF pipeline](https://huggingface.co/docs/transformers/main_classes/pipelines) in the existing handler code,...
I've seen that TorchServe [now supports](https://github.com/pytorch/serve/pull/1190) the [KServe V2 Prediction API](https://github.com/kserve/kserve/blob/master/docs/predict-api/v2) but as far as I can see, this is only the REST flavour of it, via the `kservev2` service...
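The issue above concerns the REST flavour of the KServe V2 Prediction API. For context, a minimal V2 REST inference request body, following the shape defined in the linked spec; the tensor name, shape, and data here are illustrative:

```python
import json

# Minimal KServe V2 REST inference request (field names per the V2 spec;
# tensor name and values are illustrative)
infer_request = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1, 3],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3],
        }
    ]
}

# This body would be POSTed to /v2/models/<model_name>/infer on the REST port
print(json.dumps(infer_request))
```

The gRPC flavour the issue asks about uses the same tensor structure, but encoded as protobuf messages rather than JSON.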
Added support for typing using mypy, which can help detect all sorts of subtle bugs and make the Python part of the codebase easier to understand. I also typed...
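As an illustration of the kind of subtle bug the mypy PR above targets, here is a hedged example (the function and values are hypothetical, not from the codebase): an `Optional` return type forces callers to handle the `None` case before doing arithmetic.

```python
from typing import Optional

def find_worker_port(ports: list[int], index: int) -> Optional[int]:
    """Return the port at `index`, or None if out of range (illustrative)."""
    if 0 <= index < len(ports):
        return ports[index]
    return None

port = find_worker_port([9000, 9001], 5)
# Writing `port + 1` directly here would be flagged by mypy, e.g.:
#   error: Unsupported operand types for + ("None" and "int")
if port is not None:
    print(port + 1)
else:
    print("no such worker")
```

Without the annotation, the missing `None` check would only surface at runtime as a `TypeError`.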