Nathan Price

Results: 2 comments by Nathan Price

I have heard that the architecture of Zephyr is very similar to Llama. Does TensorRT-LLM not currently work on Zephyr? I am hoping to understand what makes a new arch....

I am experiencing similar issues. I am using Llama 3 8B with LoRA weights. I get significantly worse results when making calls concurrently than I do when running one at a...