Kris Hung

Results: 101 comments by Kris Hung

Closing issue due to lack of activity. Please re-open the issue if you would like to follow up with this.

Closing issue due to lack of activity. Please re-open the issue if you would like to follow up with this.

Hi @fran6co, could you also provide the full command you are using to run tritonserver? @GuanLuo Do you see anything which could help here?

Closing issue due to lack of activity. Please re-open the issue if you would like to follow up with this issue.

Thanks for submitting the feature request! CC @nnshah1 on the request for starting the server with the OpenAI-compatible API. For the client side, we have introduced the [generate endpoint](https://github.com/triton-inference-server/server/blob/main/docs/protocol/extension_generate.md), which is...

Hi @apokerce, thanks for the repro steps. We will look into the memory issue. Meanwhile, could you also provide the full output from Valgrind? In our CI testing we...

@apokerce Thanks for providing the file. Can you also let me know what hardware (GPU/device) and framework you are using? I wasn't able to repro the OOM issue...

Hi @apokerce, I ran the reproducer on an A40 but still couldn't observe any memory growth. I used the following script to run perf_analyzer for several iterations and obtained memory usage...

Hi @apokerce, thank you for providing the files. I reran the test, and the results are basically the same. The memory usage for GRPC remains unchanged just like the above...

Thanks for the info, @apokerce! I'm running the experiment with the command you provided to see if I observe the same behavior. Bug fixes are included in newer version...