teis-e

24 comments by teis-e

@dosu-bot Is this supported in the latest version?

Does it support LLaMA now?

> I have heard that the architecture of Zephyr is very similar to LLama. Does tensorRT-LLM not work currently on Zephyr?
>
> I am hoping to understand what makes...

Please add CohereAI!! CohereForAI/c4ai-command-r-plus

Would INT4 quantization with FP16 work with multi-GPU on the 70B version? Has anyone tried it?

@njaramish Thanks! Do you know if it is possible to build it quantized, since the model only fits quantized on multiple GPUs? I tried this:

```
python3 convert_checkpoint.py --model_dir //root/.cache/huggingface/hub/models--Melon--Meta-Llama-3-70B-Instruct-AutoAWQ-4bit/snapshots/dc5cc4388d36c571d18f091e31decd82ab6621ed \...
```

But I have 3 GPUs (3x 4090). Is that an issue?