QI JUN
Yes, this is expected: prompt tuning does not work with `block_reuse` yet.
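As a workaround, disabling KV-cache block reuse should avoid the conflict. A minimal sketch, assuming the LLM API's `KvCacheConfig` with an `enable_block_reuse` flag (the flag name and the model path here are assumptions, please check against your installed version):

```python
from tensorrt_llm import LLM
from tensorrt_llm.llmapi import KvCacheConfig

# Turn off KV-cache block reuse so it does not conflict with prompt tuning.
llm = LLM(
    model="/path/to/model",  # hypothetical path
    kv_cache_config=KvCacheConfig(enable_block_reuse=False),
)
```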
@kaiyux Could you please have a look? Thanks
cc @syuoni for visibility; let's consider supporting the GPQA task in the ongoing accuracy suite.
Hi @ttim, if my understanding is correct, `gelu_pytorch_tanh` should be equivalent to the `gelu` activation function; they are just different implementations. Could you please share the error log when building...
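For reference, a small PyTorch sketch of what I mean (assuming `gelu_pytorch_tanh` corresponds to torch's tanh-approximated GELU): the two variants produce nearly identical outputs.

```python
import torch
import torch.nn.functional as F

x = torch.randn(8)

# Exact (erf-based) GELU vs. the tanh approximation used by `gelu_pytorch_tanh`.
exact = F.gelu(x)                          # approximate="none" (default)
tanh_approx = F.gelu(x, approximate="tanh")

# The two agree to within a small numerical tolerance.
print(torch.max((exact - tanh_approx).abs()))
```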
@ttim Yes, I think so. Could you please submit an MR to fix it? Or would you prefer to wait for us to fix it?
@Funatiq Could you please have a look? Thanks
@Tracin Could you please have a look? Thanks
/bot run --add-multi-gpu-test
/bot run
/bot skip --comment "CI passed"