Ed Henry
Using the [`inflight_batcher_llm`](https://github.com/triton-inference-server/tensorrtllm_backend/tree/main/all_models/inflight_batcher_llm) from [tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend/tree/main), along with some modifications to the `preprocessing` model and tokenizer configurations, I was able to get the model functional within the TensorRT-LLM backend. This...
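For reference, the tokenizer side of the change looked roughly like the sketch below. This is a minimal illustration based on the layout of the stock `inflight_batcher_llm` preprocessing model (a Triton Python backend model that reads a `tokenizer_dir` parameter from its `config.pbtxt`); it is not a verbatim copy of my modifications.

```python
# Sketch of the tokenizer-loading portion of the preprocessing model's model.py.
# Assumes the stock layout where tokenizer_dir is a config.pbtxt parameter.
import json

from transformers import AutoTokenizer


class TritonPythonModel:
    def initialize(self, args):
        model_config = json.loads(args["model_config"])
        tokenizer_dir = model_config["parameters"]["tokenizer_dir"]["string_value"]

        # Load the model's own tokenizer rather than the default one, and
        # allow custom tokenizer classes shipped with the checkpoint.
        self.tokenizer = AutoTokenizer.from_pretrained(
            tokenizer_dir,
            padding_side="left",
            trust_remote_code=True,
        )
        # Some checkpoints ship without a pad token; fall back to EOS.
        if self.tokenizer.pad_token is None:
            self.tokenizer.pad_token = self.tokenizer.eos_token
```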
I ran into this issue and modified my pipelines and the plugin to accommodate. Below is a summary of what I've done.

1. I ported my pipelines to use [modular_pipelines](https://docs.kedro.org/en/stable/nodes_and_pipelines/modular_pipelines.html)....
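As a rough illustration of the modular pipeline port mentioned in step 1, the change amounts to wrapping an existing pipeline with Kedro's `pipeline()` factory and giving it a namespace. The node function and dataset names here are hypothetical placeholders, not my actual pipeline:

```python
# Sketch of wrapping an existing pipeline as a modular (namespaced) pipeline.
from kedro.pipeline import Pipeline, node, pipeline


def preprocess(raw_data):
    # Placeholder transformation.
    return raw_data


def base_pipeline() -> Pipeline:
    return pipeline(
        [
            node(preprocess, inputs="raw_data", outputs="preprocessed_data"),
        ]
    )


def create_pipeline(**kwargs) -> Pipeline:
    # Namespacing isolates dataset names per pipeline instance, so the same
    # pipeline can be reused against different catalog entries.
    return pipeline(
        base_pipeline(),
        namespace="training",
        inputs={"raw_data"},  # keep the shared input un-namespaced
    )
```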
Would this also be the root cause of the issue I see here, where I can trace the entire set of calls end-to-end, but when using them as part...
I can confirm it was related to how I was structuring some of my objects. Apologies for jumping in on this issue as it isn't related!

> I've never seen...
Just ran into this issue again over the last few weeks and want to give this a +1. :)