server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Related PRs:
- common: https://github.com/triton-inference-server/common/pull/67
- backend: https://github.com/triton-inference-server/backend/pull/67
- tensorrt_backend: https://github.com/triton-inference-server/tensorrt_backend/pull/44
I looked at the code and found that the OpenVINO backend calls `Infer_request_.infer()`, which is synchronous mode. Can asynchronous mode be supported?
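To illustrate the difference the issue is asking about, here is a minimal sketch (not the OpenVINO backend's actual C++ code) of synchronous versus asynchronous request handling using Python's `concurrent.futures`. The `infer` function is a hypothetical stand-in for a model's forward pass:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def infer(x):
    # Hypothetical stand-in for a model forward pass; the sleep mimics compute time.
    time.sleep(0.01)
    return x * 2

def infer_sync(batch):
    # Synchronous mode: each request blocks until the previous one finishes,
    # so total latency grows linearly with the number of requests.
    return [infer(x) for x in batch]

def infer_async(batch, workers=4):
    # Asynchronous mode: requests are submitted without blocking and collected
    # after all have completed, so independent requests can overlap.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(infer, x) for x in batch]
        return [f.result() for f in futures]

results = infer_async([1, 2, 3, 4])
```

The asynchronous variant returns the same results in the same order; only the scheduling differs.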
**Description** We have an app that periodically (every 15 seconds) queries `triton-server` for liveness (`ServerLive` request), readiness (`ServerReady` request) and models info (`RepositoryIndex` request). For the past weeks, every 1...
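A periodic health check like the one described can be written against any object exposing the same methods as `tritonclient.http.InferenceServerClient` (`is_server_live`, `is_server_ready`, `get_model_repository_index`). The sketch below takes the client as a parameter so it can be exercised without a running server; the client construction in the comment is an assumption about the caller's setup:

```python
def check_server(client):
    """Collect liveness, readiness, and model-repository info in one pass.

    `client` is expected to expose the same methods as
    tritonclient.http.InferenceServerClient, e.g. one built with
    InferenceServerClient(url="localhost:8000").
    """
    return {
        "live": client.is_server_live(),        # ServerLive request
        "ready": client.is_server_ready(),      # ServerReady request
        "models": client.get_model_repository_index(),  # RepositoryIndex request
    }
```

Calling `check_server` every 15 seconds from a scheduler loop reproduces the polling pattern from the report.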
**Description** Context: Trying to load a tensorflow saved model which has multiple signature definitions. Problem: The signature definition selected is not used. This happens randomly, when loading the container multiple...
Refactor L0_infer so that future in-process APIs can use a similar model-generation scheme
How can Triton be run without Docker on Windows 10? Are there any guidelines?
**Description** Thanks for this remarkable work. I deploy a model with a variable besides the input tensor, so I want to send this variable via query_params during each infer request. But I...
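At the HTTP level, Triton's inference endpoint is `POST /v2/models/<name>[/versions/<version>]/infer`, and per-request values can be appended as a query string. Here is a small sketch of building such a URL; the parameter names are hypothetical and how the server interprets them depends on the backend:

```python
from urllib.parse import urlencode

def build_infer_url(base, model, version=None, params=None):
    # Triton's KServe-style HTTP inference path; extra per-request
    # values (hypothetical here) are appended as a query string.
    path = f"{base}/v2/models/{model}"
    if version is not None:
        path += f"/versions/{version}"
    path += "/infer"
    if params:
        path += "?" + urlencode(params)
    return path
```

For example, `build_infer_url("http://localhost:8000", "my_model", params={"scale": "0.5"})` yields a URL carrying `scale=0.5` alongside the request body.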
**Description** > I have a model built on TensorFlow v1.15 (cuda=10.0, cudnn=7.4.1). I would like to register this model with the current Triton server (v22.03); the default CUDA version is 11.6....
**Is your feature request related to a problem? Please describe.** We need to run inference with multiple models on the same data. The data is streamed, and real-time performance...
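One way Triton supports feeding the same data to multiple models is an ensemble model, whose scheduling is declared in `config.pbtxt`. The fragment below is a sketch with hypothetical model and tensor names (`model_a`, `model_b`, `RAW_INPUT`, etc.), where both steps consume the same ensemble input:

```protobuf
name: "two_model_ensemble"
platform: "ensemble"
max_batch_size: 8
input [
  { name: "RAW_INPUT" data_type: TYPE_FP32 dims: [ 3, 224, 224 ] }
]
output [
  { name: "OUTPUT_A" data_type: TYPE_FP32 dims: [ 1000 ] },
  { name: "OUTPUT_B" data_type: TYPE_FP32 dims: [ 10 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "model_a"
      model_version: -1
      input_map { key: "INPUT" value: "RAW_INPUT" }
      output_map { key: "OUTPUT" value: "OUTPUT_A" }
    },
    {
      model_name: "model_b"
      model_version: -1
      input_map { key: "INPUT" value: "RAW_INPUT" }
      output_map { key: "OUTPUT" value: "OUTPUT_B" }
    }
  ]
}
```

A single client request to the ensemble then fans out to both models, avoiding a duplicate transfer of the streamed data.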