serve
                        Serve, optimize and scale PyTorch models in production
### 🐛 Describe the bug
We encountered problems similar to https://github.com/pytorch/serve/issues/1531, and they happen quite often. See the logs below. We have two workers (9000, 9001) for this model. After worker 9000...
## This PR
* Accepts a `--serialized-file` in `.onnx` format, which is correctly loaded by the base handler using an `ort.InferenceSession()` https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html
* TensorRT format is a `.ts` extension...
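The PR above describes the base handler picking a runtime from the serialized file's extension (`.onnx` for ONNX Runtime, `.ts` for TensorRT). A minimal sketch of that dispatch logic, with a hypothetical helper name (`pick_runtime`) and the actual loader calls left as comments:

```python
import os

def pick_runtime(serialized_file: str) -> str:
    """Choose a model runtime from the file extension (hypothetical helper)."""
    ext = os.path.splitext(serialized_file)[1].lower()
    if ext == ".onnx":
        # would load with ort.InferenceSession(serialized_file)
        return "onnxruntime"
    if ext == ".ts":
        # TorchScript/TensorRT artifact
        return "torch-tensorrt"
    # default: eager/TorchScript PyTorch model
    return "torch"

print(pick_runtime("model.onnx"))  # → onnxruntime
```

The actual handler would branch into the corresponding loader instead of returning a label; this sketch only shows the extension-based selection the PR relies on.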
TorchServe returns no prediction when input data gets bigger (Backend worker did not respond in given time)
### 🐛 Describe the bug
I am passing JSON data to python-requests. For simplification, you can assume the following input: `dic1 = {"main": "this is a main", "categories": "this...`
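The report above sends a JSON dict to TorchServe via python-requests and sees worker timeouts as the payload grows. A minimal sketch of building and sizing such a payload; the endpoint URL and model name are illustrative, and the actual `requests.post` call is left as a comment so the snippet runs without a server:

```python
import json

# Payload resembling the report's input (values illustrative)
dic1 = {"main": "this is a main", "categories": "this is a category"}

body = json.dumps(dic1)  # requests does this for you when passing json=dic1
print(len(body.encode("utf-8")))  # payload size in bytes; large bodies can exceed worker response time

# With a TorchServe instance running, one would post it like this:
# import requests
# resp = requests.post("http://127.0.0.1:8080/predictions/my_model",
#                      json=dic1, timeout=120)
```

Checking the serialized size first helps correlate "Backend worker did not respond in given time" errors with request size.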
### 🚀 The feature
A `client_timeout` setting is used for TorchServe to stop processing a request as soon as possible.
### Motivation, pitch
This feature can reduce the inference failure rate on the client...
This change implements:
- Deserializing inference requests into `InferenceRequestBatch`
- Serializing `InferenceResponseBatch` and writing it to the socket in the OTF protocol

Testing:
- Tested inference request deserialization by passing a sample inference...
## Description
I wanted to add a `-e` mode to our `pip install .` scripts so I could make local changes to Python files and immediately have them reflected without...
## Description
* Folly log integration: I tried to implement a factory pattern to keep logging decoupled to some extent, but generics and macros were giving a lot of trouble...
@msaroufim @HamidShojanazeri As discussed in #1818, I have uploaded all the relevant files so far. Tasks:
- [ ] Add preliminary support for [HF pipeline](https://huggingface.co/docs/transformers/main_classes/pipelines) in the existing handler code,...
I've seen that TorchServe [now supports](https://github.com/pytorch/serve/pull/1190) the [KServe V2 Prediction API](https://github.com/kserve/kserve/blob/master/docs/predict-api/v2) but as far as I can see, this is only the REST flavour of it, via the `kservev2` service...
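The issue above concerns the REST flavour of the KServe V2 Prediction API. For context, a minimal V2 REST inference request body, following the shape defined in the linked spec; the tensor name, shape, and data here are illustrative:

```python
import json

# Minimal KServe V2 REST inference request (field names per the V2 spec;
# tensor name and values are illustrative)
infer_request = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1, 3],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3],
        }
    ]
}

# This body would be POSTed to /v2/models/<model_name>/infer on the REST port
print(json.dumps(infer_request))
```

The gRPC flavour the issue asks about uses the same tensor structure, but encoded as protobuf messages rather than JSON.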
Added support for typing using mypy, which can help detect all sorts of subtle bugs and make the Python part of the codebase easier to understand. I also typed...
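As an illustration of the kind of subtle bug the mypy PR above targets, here is a hedged example (the function and values are hypothetical, not from the codebase): an `Optional` return type forces callers to handle the `None` case before doing arithmetic.

```python
from typing import Optional

def find_worker_port(ports: list[int], index: int) -> Optional[int]:
    """Return the port at `index`, or None if out of range (illustrative)."""
    if 0 <= index < len(ports):
        return ports[index]
    return None

port = find_worker_port([9000, 9001], 5)
# Writing `port + 1` directly here would be flagged by mypy, e.g.:
#   error: Unsupported operand types for + ("None" and "int")
if port is not None:
    print(port + 1)
else:
    print("no such worker")
```

Without the annotation, the missing `None` check would only surface at runtime as a `TypeError`.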