GuanLuo
core: https://github.com/triton-inference-server/core/pull/109

Example: if a model has the config below and Triton is built with ENABLE_GPU=OFF:

```
name: "add_sub"
backend: "python"
input [ ... ]
output [ ... ]
# not providing instance group...
```
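For context, my reading of the change (not a verbatim excerpt from the PR) is that a config with no instance group falls back to CPU instances when the build has no GPU support; an explicit equivalent of that default would look roughly like this:

```
# Illustrative only: the explicit instance group that the implicit default is
# assumed to resolve to on an ENABLE_GPU=OFF build (count shown as 1 for clarity).
instance_group [
  {
    count: 1
    kind: KIND_CPU
  }
]
```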
**Description**

The generation first looks for the ["CUDNN_VERSION" environment variable on the host system](https://github.com/triton-inference-server/onnxruntime_backend/blob/main/tools/gen_ort_dockerfile.py#L429-L435), and only later uses the [version in the docker image](https://github.com/triton-inference-server/onnxruntime_backend/blob/main/tools/gen_ort_dockerfile.py#L94-L98). CUDNN ships with the docker image so it may...
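A minimal sketch of that lookup order follows; the variable and function names are illustrative, not the ones actually used in gen_ort_dockerfile.py:

```
import os

# Hypothetical stand-in for the CUDNN version pinned inside the docker image.
IMAGE_CUDNN_VERSION = "X.Y.Z"

def choose_cudnn_version():
    # 1. Prefer the CUDNN_VERSION environment variable on the host, if set.
    host_version = os.environ.get("CUDNN_VERSION")
    if host_version:
        return host_version
    # 2. Otherwise fall back to the version that ships with the docker image.
    return IMAGE_CUDNN_VERSION
```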
# Ask a Question

### Question

I was toggling the node parameters for testing and I noticed that the ONNX checker doesn't complain about the following model whose output has...
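For reference, here is a self-contained way to run the checker on a toy model (this is not the model from the question, which is truncated above):

```
import onnx
from onnx import helper, TensorProto

# Build a minimal single-node graph just to show how the checker is invoked.
x = helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 4])
y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [1, 4])
node = helper.make_node("Identity", ["X"], ["Y"])
graph = helper.make_graph([node], "toy_graph", [x], [y])
model = helper.make_model(graph)

# check_model raises onnx.checker.ValidationError when the model is invalid
# and stays silent otherwise.
onnx.checker.check_model(model)
```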
Follow up on https://github.com/triton-inference-server/core/pull/229

For a custom backend, one may send responses in the following style, even for a "non-decoupled" model:

```
TRITONBACKEND_ResponseSend(response, 0, nullptr /* success */);
TRITONBACKEND_ResponseFactorySendFlags(factory, TRITONSERVER_RESPONSE_COMPLETE_FINAL);
```
...
This PR provides the low-level binding for Python users to interact with the Triton library within the same process; however, this binding is not intended for Python users to use...
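To give a feel for what "low level" means here, the sketch below is purely hypothetical: the module and names are assumptions that mirror the shape of the TRITONSERVER_* C API, not the binding added by this PR:

```
# Hypothetical names throughout; this only illustrates a thin, C-API-shaped
# binding rather than the actual interface introduced in the PR.
import triton_low_level as t  # assumed low-level module name

options = t.ServerOptions()
options.set_model_repository_path("/models")

server = t.Server(options)  # Triton runs inside the current Python process

# The caller is responsible for C-API-style bookkeeping (explicitly stopping
# and releasing handles), which is why a higher-level wrapper is expected
# for end users instead of this binding.
server.stop()
```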
`FIXME` marks the sections where further discussion is desired. The wrapper is restrictive in that a lot of the interaction with the in-process API is pre-defined (i.e. how to handle released `TRITONSERVER_Request` and...
@wphicks this PR is sourced from the main branch of the rapids-triton repo; I removed [`squeeze_output`](https://github.com/rapidsai/rapids-triton/blob/main/cpp/include/rapids_triton/model/shared_state.hpp#L74-L81) as it seems to exist only for backward compatibility. Other than that, all changes are...
#### What does the PR do?

Add unit tests to constrain the behavior of the shared memory utilities; a rough idea of such a test is sketched after the checklist.

#### Checklist

- [x] PR title reflects the change and is of format...
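As a rough illustration of the kind of round-trip behavior such tests pin down, here is a sketch assuming the client-side system shared memory helpers in `tritonclient.utils.shared_memory`; the PR itself may target a different module:

```
import unittest
import numpy as np
import tritonclient.utils.shared_memory as shm

class SharedMemoryRoundTripTest(unittest.TestCase):
    # Assumption: the utilities under test behave like the client-side
    # system shared memory helpers used below.
    def test_write_then_read_back(self):
        data = np.arange(8, dtype=np.float32)
        handle = shm.create_shared_memory_region("test_region", "/test_key", data.nbytes)
        try:
            shm.set_shared_memory_region(handle, [data])
            result = shm.get_contents_as_numpy(handle, np.float32, [8])
            np.testing.assert_array_equal(result, data)
        finally:
            shm.destroy_shared_memory_region(handle)

if __name__ == "__main__":
    unittest.main()
```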