Serve, optimize and scale PyTorch models in production

Results 432 serve issues

### 🐛 Describe the bug We encountered a problem similar to https://github.com/pytorch/serve/issues/1531, and it happens quite often. See logs below. We have two workers (9000, 9001) for this model. After worker 9000...

bug
p0

## This PR * a `--serialized-file` that's in `.onnx` format, which will be correctly loaded by the base handler using an `ort.InferenceSession()` https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html * TensorRT format is a `.ts` extension...

### 🐛 Describe the bug I am passing JSON data to python-requests. For simplification you can assume the following input: ``` dic1 = {"main": "this is a main", "categories": "this...

triaged_wait
gpu

### 🚀 The feature client_timeout is used for TS to stop processing a request asap. ### Motivation, pitch This feature is able to reduce the inference failure rate on client...

enhancement
perf

This change implements - Deserializing inference request to `InferenceRequestBatch` - Serializing `InferenceResponseBatch` and writing to socket in OTF protocol Testing: - Tested inference request deserialization by passing a sample inference...

c++

## Description I wanted to add a `-e` mode to our `pip install .` scripts so I could make local changes to python files and immediately have them reflected without...

## Description * Folly log integration: I tried to implement a factory pattern to keep logging decoupled to some extent but Generics and Macros were giving a lot of trouble....

c++

@msaroufim @HamidShojanazeri As discussed in #1818, I have uploaded all the relevant files so far. Tasks: - [ ] Add preliminary support for [HF pipeline](https://huggingface.co/docs/transformers/main_classes/pipelines) in existing handler code,...

I've seen that TorchServe [now supports](https://github.com/pytorch/serve/pull/1190) the [KServe V2 Prediction API](https://github.com/kserve/kserve/blob/master/docs/predict-api/v2) but as far as I can see, this is only the REST flavour of it, via the `kservev2` service...

enhancement
triaged
kubernetes

Added support for typing using mypy, which can help detect all sorts of subtle bugs and make it easier to understand the Python part of the codebase. I also typed...