juney-nvidia

Results 117 comments of juney-nvidia

@byshiue Can you help review this MR? Thanks. June

@zmy1116 Hi, the error message containing `libtriton_tensorrt.so` indicates that you are trying to use the TensorRT backend to serve a specific model. And in the TensorRT-LLM backend repo we haven't...

@jiahanc Hi Cyrus, I think you are the right person to answer this question? :) cc @NVGaryJi for vis also.

> LGTM. I don't have any approve button though

Can you try again?

And let me trigger the CI, since this MR, although small, can affect the tests.

@Bihan Hi, prefix caching (KV cache reuse) is still being developed by our engineering team. I would expect it to land in the main branch in the upcoming weeks....

Thanks for contributing this fix, @WilliamTambellini. Let me trigger the CI now. June