nv-guomingz comments

Results: 27 comments of nv-guomingz

multi block mode performance issue

Please reopen this ticket if there's further discussion.

Detected layernorm nodes in FP16.

Please reopen this ticket if there's further discussion.

Detected layernorm nodes in FP16.

@akhoroshev Could we measure the TRT-LLM engines' output quality via the MMLU script?

Detected layernorm nodes in FP16.

> @nv-guomingz I have no problems with the quality of generation, but this warning is very annoying. Previously @byshiue said that it should run FP32

Thanks for confirming, let me...

Mixtral convertation OOM Fix

trt-llm will add a `--device` knob in the coming release; you can then specify `--device cpu` to avoid such OOM issues.

KeyError: 'model.layers.0.self_attn.q_proj.qweight'

> Has this problem been solved? I have the same error when using a quantized mixtral model

Hi @Mary-Sam, could you please list more details/logs on your issue? So we...

Support SDXL and its distributed inference

> @Zars19 thanks for the contribution to TensorRT-LLM!
>
> @nv-guomingz can you help take care of this? :)
>
> Thanks June

Sure, I'll collaborate with @Zars19 for enabling...

Support SDXL and its distributed inference

Hi @Zars19, could you please resolve the code conflicts first?

Support SDXL and its distributed inference

Hi @Zars19, thanks for your patience. Could you please update this MR by rebasing those two commits (including one merge commit) into one commit, which makes it easier for us to integrate and...

Missing kernels for sm_87 (Jetson Orin AGX)

Closing it now; you may reopen it as a feature request.