server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Thanks for submitting a PR to Triton! Please go to the `Preview` tab above this description box and select the appropriate sub-template: * [PR description template for Triton Engineers](?expand=1&template=pull_request_template_internal_contrib.md) *...
**Description** I am getting memory-corruption issues with a stateful BLS model. It seems like Triton is trying to free memory that is still in use. **Triton Information** 24.07 Are...
We are running `tritonclient[http]=2.41.0` with the server running `24.06-py3`. When there are O(600) requests reaching the server, we intermittently receive the following error from Triton: ``` Traceback (most recent call last):...
**Is your feature request related to a problem? Please describe.** When writing the `model.py` file for a Python backend model, it is very difficult to correctly use `triton_python_backend_utils` (aka `pb_utils`)....
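For context, the `pb_utils` interface this request refers to follows a fixed shape: a `model.py` exposing a `TritonPythonModel` class. Below is a minimal sketch, assuming input/output tensors named `INPUT0`/`OUTPUT0` in the model's `config.pbtxt` (placeholder names, not from the issue); `triton_python_backend_utils` is only importable inside Triton's Python backend, so this does not run standalone.

```python
# model.py -- minimal sketch of a Triton Python backend model.
# Runs only inside Triton's Python backend, where
# triton_python_backend_utils is available.
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # args is a dict of strings; "model_config" holds the config as JSON.
        self.model_config = args["model_config"]

    def execute(self, requests):
        responses = []
        for request in requests:
            # Look up the tensor declared as "INPUT0" in config.pbtxt.
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            # Illustrative transform: double the input values.
            out = pb_utils.Tensor("OUTPUT0", in0.as_numpy() * 2)
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out])
            )
        # One response per request, in order.
        return responses
```

The difficulty the request describes is real: because `pb_utils` is injected at runtime, editors cannot resolve these names for autocompletion or type checking.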
**Description** I am testing sending data received as output from one model as input to my Python backend for post-processing (I will eventually build an ensemble). The problem...
**Description** I am trying to set up and build ONNX Runtime natively on Windows 10, without Docker, following the instructions in the [README](https://github.com/triton-inference-server/onnxruntime_backend/blob/main/README.md) file of the...
**Description** Hi, I have set up Triton version 2.47 for Windows, along with the ONNX Runtime backend, based on the assets for Triton 2.47 listed at this URL: https://github.com/triton-inference-server/server/releases/...
Description of problem: I ran some experiments to compare timing performance of standalone inference with a TensorRT model vs. Triton serving the same TensorRT model, using identical input on...
**Is your feature request related to a problem? Please describe.** I have python components that I would like to use in multiple ensembles (both within a container but also in...
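Reuse across ensembles is typically expressed in the ensemble's `config.pbtxt`, where each `step` references another model by name, so one Python model can appear in several ensembles. A minimal sketch, with all model and tensor names (`shared_preprocess`, `classifier`, `RAW`, etc.) invented for illustration:

```
name: "ensemble_a"
platform: "ensemble"
input [ { name: "RAW", data_type: TYPE_STRING, dims: [ 1 ] } ]
output [ { name: "SCORES", data_type: TYPE_FP32, dims: [ -1 ] } ]
ensemble_scheduling {
  step [
    {
      # A Python-backend model that another ensemble can also reference.
      model_name: "shared_preprocess"
      model_version: -1
      input_map { key: "INPUT0" value: "RAW" }
      output_map { key: "OUTPUT0" value: "preprocessed" }
    },
    {
      model_name: "classifier"
      model_version: -1
      input_map { key: "INPUT0" value: "preprocessed" }
      output_map { key: "OUTPUT0" value: "SCORES" }
    }
  ]
}
```

This sharing works within one model repository; sharing the same component across separate containers is the harder part the request raises, since each server instance needs its own copy of the model directory.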
**Description** The timeout value defined in config.pbtxt is not triggered at the defined value, but only after the model has finished its current...
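The behavior described is consistent with how Triton's queue timeout works: the per-request timeout in `dynamic_batching`'s queue policy applies while a request is waiting in the scheduler queue, not to an execution that has already started, so an in-flight inference runs to completion before the timeout is observed. A sketch of the relevant config fragment (the timeout value is illustrative):

```
dynamic_batching {
  default_queue_policy {
    # Applies to requests waiting in the queue; a request already
    # dispatched to the backend is not interrupted.
    timeout_action: REJECT
    default_timeout_microseconds: 100000
  }
}
```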