nv-guomingz
Please reopen this ticket if there's further discussion.
@akhoroshev Could we measure the trtllm engines' output quality via the MMLU script?
> @nv-guomingz I have no problems with the quality of generation, but this warning is very annoying. Previously @byshiue said that it should run FP32

Thanks for confirming, let me...
TensorRT-LLM will add a `--device` knob in an upcoming release; you can then specify `--device cpu` to avoid such OOM issues.
> Has this problem been solved? I have the same error when using a quantized mixtral model

Hi @Mary-Sam, could you please list more details/logs on your issue? So we...
> @Zars19 thanks for the contribution to TensorRT-LLM!
>
> @nv-guomingz can you help take care of this? :)
>
> Thanks June

Sure, I'll collaborate with @Zars19 for enabling...
Hi @Zars19, could you please resolve the code conflicts first?
Hi @Zars19, thanks for your patience. Could you please update this MR by rebasing those two commits (including one merge commit) into one commit, which makes it easy for us to integrate and...
Closing it now; you may reopen it as a feature request.