Katherine Yang

100 comments by Katherine Yang

Closing this issue due to lack of activity. If this issue needs follow-up, please let us know and we can reopen it for you.

> > Can't I use an existing Docker file with dependencies listed for Triton?
>
> The problem is that the Dockerfile for the `min`...

Also, NVIDIA offers an official support path for its products that could help with this issue: https://www.nvidia.com/en-us/data-center/products/ai-enterprise/#benefits

Hi, we're still working with GKE on this issue.

Closing this issue due to lack of activity. If this issue needs follow-up, please let us know and we can reopen it for you.

@zhaozhiming37 you can read about how to use CUDA shared memory here: https://github.com/triton-inference-server/client#cuda-shared-memory and https://github.com/triton-inference-server/client#download-docker-image-from-ngc

You can find an example here: https://github.com/triton-inference-server/client/blob/main/src/python/examples/simple_http_shm_client.py

When you are sending new requests you set...
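As a rough illustration of the flow in the linked client example, here is a minimal sketch of registering a CUDA shared-memory region and pointing a request's input at it. The model name (`example_model`), input name (`INPUT0`), region name, shapes, and URL are illustrative assumptions, not from the comments above; the `tritonclient` calls follow the library's `cuda_shared_memory` utilities and require a GPU plus a running Triton server.

```python
# Hedged sketch of the CUDA shared-memory request flow, modeled on the
# simple_http_shm_client.py example linked above. Model/input/region names
# and shapes are hypothetical placeholders.
import numpy as np


def make_input(shape=(16,), dtype=np.float32) -> np.ndarray:
    """Prepare a request payload; its byte size sizes the shm region."""
    return np.arange(np.prod(shape), dtype=dtype).reshape(shape)


def send_via_cuda_shm(data: np.ndarray, url: str = "localhost:8000") -> None:
    # Imports kept local so the sketch is readable without a GPU/Triton install.
    import tritonclient.http as httpclient
    import tritonclient.utils.cuda_shared_memory as cudashm

    byte_size = data.nbytes
    client = httpclient.InferenceServerClient(url=url)

    # 1. Create a CUDA shared-memory region on device 0 and copy the input in.
    shm_handle = cudashm.create_shared_memory_region("input_region", byte_size, 0)
    cudashm.set_shared_memory_region(shm_handle, [data])

    # 2. Register the region with the server once; reuse it for new requests.
    client.register_cuda_shared_memory(
        "input_region", cudashm.get_raw_handle(shm_handle), 0, byte_size
    )

    # 3. For each new request, point the input tensor at the registered region.
    infer_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
    infer_input.set_shared_memory("input_region", byte_size)
    client.infer("example_model", inputs=[infer_input])

    # 4. Clean up when done.
    client.unregister_cuda_shared_memory("input_region")
    cudashm.destroy_shared_memory_region(shm_handle)
```

The key point for new requests is step 3: once the region is registered, each request just references it by name and byte size instead of copying the tensor over HTTP.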

^ @kthui any idea why this might be happening?

Hello. Looking into this now. As an FYI, this will be added to the 24.05 release instead of 24.04 since it was not prioritized for this release.

Merging together with https://github.com/triton-inference-server/client/pull/465 once pre-commit passes; tested with https://github.com/triton-inference-server/server/pull/7123.

Thanks for your change. Can you sign the [CLA](https://github.com/triton-inference-server/server/blob/main/CONTRIBUTING.md#contributor-license-agreement-cla)? Otherwise LGTM.