tensorrtllm_backend
Example of LoRA weights
I would like to pass LoRA weights to a compiled TensorRT-LLM model, but I am unsure how to load the .bin weights and send them to Triton. An example that loads the weights and passes them in with a request would be very helpful.